I  Introduction

In  a  previous  study  (Samejima,  RR-80-3) ,  it  has  been  observed 
that  Bayesian  estimation  has  a  characteristic  which  contradicts  the 
principle  of  objectivity  of  testing,  because  of  its  bias  caused  by 
the  effect  of  the  prior,  in  estimating  the  examinee's  ability  from 
his  or  her  performance  in  the  test.  Thus  a  fairly  widespread 
belief  among  psychologists  that  Bayesian  estimation  is  better  than 
the  maximum  likelihood  estimation  because  of  the  additional  information, 
the  prior,  and  because  of  the  fact  that  it  provides  us  with  finite 
estimates  for  all  the  possible  response  patterns,  should  be  seriously 
reconsidered  and  dismissed.  In  contrast  to  the  Bayesian  estimators, 
the  maximum  likelihood  estimator  has  the  characteristic  of  asymptotic 
unbiasedness, and it has been shown (Samejima, 1977a, 1977b, 1977c, and
RR-77-1) that this unbiasedness holds as a good approximation even
with  a  relatively  short  test  and  with  mediocre  values  of  the  test 
information  function  for  the  interval  of  ability,  or  latent  trait, 
of  our  interest. 

When  the  test  information  function  of  our  test  assumes 
reasonably  high  values  throughout  the  range  of  ability  of  our  interest, 
the  probability  with  which  an  examinee  obtains  one  of  the  two  extreme 
response patterns, i.e., the set of the lowest item scores and that
of  the  highest  item  scores,  is  negligibly  small.  In  such  a  case,  it 
is  almost  certain  that  the  maximum  likelihood  estimator  provides  us 
with  finite  estimates  for  all  the  examinees  and  will  not  give  us  any 
inconvenience in either research or practice. The fact of the
matter is that we should construct and use such tests for both
purposes. In practice, however, very little consideration of this
nature, and very little theoretical insight, has gone into the
construction of tests and their subsequent use. Many researchers and
users of tests casually pick up existing tests and use them for a variety
of purposes, blame the maximum likelihood estimation for the
fact that it provides us with negative and positive infinities for
some examinees as their ability estimates, and turn to the Bayesian
estimation simply because it does not produce infinities. Scientific
examination reveals, however, that this is nothing but a disguise;
the simple fact is that the test itself fails to have enough power to
estimate the examinees' ability levels (cf. Samejima, RR-80-3).

We must accept the fact that every test has a finite range of ability
for which it can estimate ability levels accurately enough, and avoid
the pretense that it can do so outside of that range of ability by the
use of such inadequate information as a prior.

With this basic understanding in mind, a question will arise
as  to  whether  there  is  any  way  of  expanding  this  range  of  ability  a 
test  has,  without  turning  to  any  inappropriate  information,  and 
without  sacrificing  our  scientific  honesty  and  the  objectivity  of 
testing.  If  this  is  possible,  then  it  will  contribute  to  our  research 
and  practice,  since  we  could  use  a  wider  range  of  existing  tests  for 
our purposes with our appropriate selections.

A positive answer to the above question has been given
(Samejima,  RR-80-3)  for  the  situation  in  which  the  number  of  test 
items  is  relatively  small,  by  proposing  an  alternative  pair  of 
estimates  for  the  two  extreme  response  patterns,  which  will  replace 
negative  and  positive  infinities  resulting  from  the  maximum  likelihood 
estimation.  The  present  study  is  a  continuation  of  the  previous 
study,  in  which  the  concept  of  these  alternative  estimates  is  expanded 
to  cover  the  situation  where  the  test  has  a  larger  number  of  test  items. 


II  Comparison of the New Estimate $\theta^*$ with Several Other Estimates

Let $\theta$ be ability, or latent trait, which assumes any real
number, such that

(2.1)  $-\infty < \theta < \infty$ .


Let $g$ $(= 1, 2, \ldots, n)$ denote an item, and $x_g$ $(= 0, 1, 2, \ldots, m_g)$ be
a graded item response to item $g$. The operating characteristic, $P_{x_g}(\theta)$,
of the graded item response, or item score, $x_g$ is defined as the
conditional probability, given ability $\theta$, with which the examinee
obtains the item score $x_g$ for item $g$. In the normal ogive model,
this operating characteristic is defined by

(2.2)  $P_{x_g}(\theta) = (2\pi)^{-1/2} \int_{a_g(\theta - b_{x_g+1})}^{a_g(\theta - b_{x_g})} e^{-u^2/2} \, du$ ,

where $a_g$ $(> 0)$ is the item discrimination parameter and $b_{x_g}$ is the
item response difficulty parameter which satisfies

(2.3)  $-\infty = b_0 < b_1 < b_2 < \cdots < b_{m_g} < b_{(m_g+1)} = \infty$ .
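The definition in (2.2) and (2.3) can be sketched numerically as follows. This is only an illustrative sketch, not code from the original study: the function names `norm_cdf` and `operating_characteristic` and the parameter values in the usage lines are hypothetical, and the integral in (2.2) is evaluated as a difference of two standard normal distribution functions.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def operating_characteristic(theta, a, b, x):
    """P_{x_g}(theta) of (2.2) for item score x (0 <= x <= m).

    b lists the finite difficulty parameters b_1 < ... < b_m; the
    conventions b_0 = -inf and b_{m+1} = +inf of (2.3) are applied here.
    """
    m = len(b)
    # Upper limit a(theta - b_x); it is +inf when x = 0.
    upper = 1.0 if x == 0 else norm_cdf(a * (theta - b[x - 1]))
    # Lower limit a(theta - b_{x+1}); it is -inf when x = m.
    lower = 0.0 if x == m else norm_cdf(a * (theta - b[x]))
    return upper - lower


# Hypothetical three-category item: the probabilities sum to one.
probs = [operating_characteristic(0.3, 1.2, [-0.5, 0.8], x) for x in range(3)]
print(probs, sum(probs))
```

For a binary item, `b` has a single element and the score-1 probability reduces to the usual normal ogive item characteristic function.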


Let $V$ denote the response pattern, or a vector of $n$ item scores such
that

(2.4)  $V' = (x_1, x_2, \ldots, x_n)$ .

By the assumption of local independence (Lord and Novick, 1968), the
operating characteristic of the response pattern, $P_V(\theta)$, or the
conditional probability, given ability $\theta$, with which the examinee
obtains the response pattern $V$, is the simple product of the $n$
operating characteristics of the graded item scores, such that

(2.5)  $P_V(\theta) = \prod_{x_g \in V} P_{x_g}(\theta)$ .

The maximum likelihood estimate, $\hat{\theta}_V$, of ability $\theta$ for the examinee
whose response pattern is $V$ is obtained by using this operating
characteristic $P_V(\theta)$ as the likelihood function $L_V(\theta)$, or, equivalently,
as the solution of $\theta$ for the equation

(2.6)  $\sum_{x_g \in V} A_{x_g}(\theta) = 0$ ,

where $A_{x_g}(\theta)$ is the basic function for the item score $x_g$, which is
defined by

(2.7)  $A_{x_g}(\theta) = \frac{\partial}{\partial\theta} \log P_{x_g}(\theta)$ .
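As an illustration of (2.6) and (2.7), the following sketch solves the likelihood equation for binary normal ogive items by bisection. It is a minimal sketch under stated assumptions, not the computational procedure of the original study: the function names, the search interval (-10, 10), and the item parameters in the usage line are all hypothetical. Bisection is justified here because, for binary normal ogive items, the left-hand side of (2.6) is strictly decreasing in $\theta$.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_pdf(z):
    """Standard normal density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def score(theta, pattern, a, b):
    """Left-hand side of (2.6): the sum of the basic functions (2.7)."""
    total = 0.0
    for x, ag, bg in zip(pattern, a, b):
        z = ag * (theta - bg)
        if x == 1:
            total += ag * norm_pdf(z) / norm_cdf(z)
        else:
            total -= ag * norm_pdf(z) / norm_cdf(-z)
    return total


def mle(pattern, a, b, lo=-10.0, hi=10.0, tol=1e-10):
    """Maximum likelihood estimate; infinite for the two extreme patterns."""
    if all(x == 0 for x in pattern):
        return -math.inf
    if all(x == 1 for x in pattern):
        return math.inf
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, pattern, a, b) > 0.0:
            lo = mid  # likelihood still increasing: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical two-item test, symmetric about theta = 0.
print(mle((1, 0), [1.0, 1.0], [-1.0, 1.0]))
```

The two extreme patterns are handled separately because the score function never changes sign for them, which is the source of the infinite estimates discussed in the text.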


The item response information function, $I_{x_g}(\theta)$, for the item score
$x_g$ is obtained from the basic function, or directly from the operating
characteristic, by

(2.8)  $I_{x_g}(\theta) = -\frac{\partial}{\partial\theta} A_{x_g}(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_{x_g}(\theta)$ ,

and the item information function, $I_g(\theta)$, is defined as the conditional
expectation of the item response information function, given $\theta$,
such that

(2.9)  $I_g(\theta) = E[I_{x_g}(\theta) \mid \theta] = \sum_{x_g=0}^{m_g} I_{x_g}(\theta) P_{x_g}(\theta)$ .

We can write for the response pattern information function, $I_V(\theta)$,
such that

(2.10)  $I_V(\theta) = -\frac{\partial^2}{\partial\theta^2} \log P_V(\theta) = \sum_{x_g \in V} I_{x_g}(\theta)$ ,

and the test information function, $I(\theta)$, is defined as the conditional
expectation of the response pattern information function, given $\theta$,
such that

(2.11)  $I(\theta) = \sum_V I_V(\theta) P_V(\theta)$ .

It can be shown that the test information function, which is defined by
(2.11), is also the sum of the $n$ item information functions, so that
we can write

(2.12)  $I(\theta) = \sum_{g=1}^{n} I_g(\theta)$ .

One of the important and useful characteristics of the maximum
likelihood estimate is that, asymptotically, it is normally distributed
with $\theta$ and $[I(\theta)]^{-1/2}$ as the two parameters (Samejima, 1975). It
has been shown (Samejima, 1975, 1977a, 1977b, 1977c and RR-77-1) that
this convergence to the normality of the conditional distribution of
the maximum likelihood estimate is fairly fast, and even with a relatively
small number of test items and a mediocre amount of test information
this asymptotic normality can be used as a good approximation to the
conditional distribution of the maximum likelihood estimate, given
ability $\theta$, when the operating characteristic of the item score $x_g$
follows the normal ogive model. If the number of items is too small
and so is the amount of test information, this approximation will not
hold, however.

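Let V-min and V-max denote the two extreme response patterns,
such that

(2.13)  $(V\text{-min})' = (0, 0, \ldots, 0)$ ,  $(V\text{-max})' = (m_1, m_2, \ldots, m_n)$ .

In such models as the normal ogive model and the logistic model on the
graded response level (Samejima, 1969, 1972), the maximum likelihood
estimate for the response pattern V-min is negative infinity, and
that for V-max is positive infinity. In the situation described at the
end of the preceding paragraph, in which both the number of items and
the amount of test information are too small, the probability with which
we obtain negative or positive infinity as the maximum likelihood
estimate is no longer negligibly small. The approximate unbiasedness of
the maximum likelihood estimate, therefore, cannot be attained in such a
situation. The situation will be salvaged if we define a pair of
estimates, $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$, such that

(2.14)
$\theta^*_{V\text{-min}} = \left[ \tfrac{1}{2}(\theta_c^2 - \underline{\theta}^2) - \sum_{\substack{V \neq V\text{-min} \\ V \neq V\text{-max}}} \hat{\theta}_V \int_{\underline{\theta}}^{\theta_c} P_V(\theta) \, d\theta \right] \left[ \int_{\underline{\theta}}^{\theta_c} P_{V\text{-min}}(\theta) \, d\theta \right]^{-1}$

$\theta^*_{V\text{-max}} = \left[ \tfrac{1}{2}(\overline{\theta}^2 - \theta_c^2) - \sum_{\substack{V \neq V\text{-min} \\ V \neq V\text{-max}}} \hat{\theta}_V \int_{\theta_c}^{\overline{\theta}} P_V(\theta) \, d\theta \right] \left[ \int_{\theta_c}^{\overline{\theta}} P_{V\text{-max}}(\theta) \, d\theta \right]^{-1}$ ,

where $\underline{\theta}$ and $\overline{\theta}$ are the lower and upper endpoints of an appropriately
defined interval of $\theta$, and $\theta_c$ is a critical value of $\theta$ below which
the operating characteristic $P_{V\text{-max}}(\theta)$ assumes negligibly small values
and above which $P_{V\text{-min}}(\theta)$ assumes negligibly small values, and use
them as the substitutes for the negative and positive infinities of the
maximum likelihood estimate, respectively (cf. Samejima, RR-80-3).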
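For a short binary test, the pair of estimates in (2.14) can be computed by enumerating all response patterns. The sketch below is illustrative only and is not the program used in the original study: the five item parameters are hypothetical (they are not those of LIS-U), the integrals are approximated by Simpson's rule, and the finite maximum likelihood estimates are found by bisection within an assumed search interval (-10, 10).

```python
import itertools
import math

# Hypothetical parameters for five binary items (not those of LIS-U).
A = [1.0, 1.2, 1.5, 1.2, 1.0]
B = [-1.0, -0.5, 0.0, 0.5, 1.0]


def cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def pattern_prob(theta, v):
    """P_V(theta) of (2.5) for binary normal ogive items."""
    p = 1.0
    for x, a, b in zip(v, A, B):
        z = a * (theta - b)
        p *= cdf(z) if x == 1 else cdf(-z)
    return p


def mle(v, lo=-10.0, hi=10.0, tol=1e-10):
    """Finite maximum likelihood estimate, by bisection on (2.6)."""
    def score(theta):
        s = 0.0
        for x, a, b in zip(v, A, B):
            z = a * (theta - b)
            s += a * pdf(z) / cdf(z) if x == 1 else -a * pdf(z) / cdf(-z)
        return s
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def integrate(f, lo, hi, n=400):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def modified_extremes(theta_lo, theta_hi, theta_c):
    """The two substitute estimates of (2.14)."""
    n = len(A)
    v_min, v_max = (0,) * n, (1,) * n
    low_sum = high_sum = 0.0
    for v in itertools.product((0, 1), repeat=n):
        if v in (v_min, v_max):
            continue
        t = mle(v)
        low_sum += t * integrate(lambda th: pattern_prob(th, v), theta_lo, theta_c)
        high_sum += t * integrate(lambda th: pattern_prob(th, v), theta_c, theta_hi)
    star_min = (0.5 * (theta_c ** 2 - theta_lo ** 2) - low_sum) / integrate(
        lambda th: pattern_prob(th, v_min), theta_lo, theta_c)
    star_max = (0.5 * (theta_hi ** 2 - theta_c ** 2) - high_sum) / integrate(
        lambda th: pattern_prob(th, v_max), theta_c, theta_hi)
    return star_min, star_max


print(modified_extremes(-2.0, 2.0, 0.0))
```

Because the hypothetical item set is symmetric about $\theta = 0$ and the interval and critical value are chosen symmetrically, the two values returned are equal in absolute value and opposite in sign, which gives a simple sanity check on the arithmetic.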

Since every test has only a finite number of items, and the amount
of test information is limited (Samejima, RR-79-1), any test is informative
enough in estimating the examinee's ability only when his or her ability
lies within a subset of the entire range of ability. This subset may
be a single, finite interval of $\theta$, or a set of several intervals,
depending upon the combination of test items and their characteristics.
In many cases, however, the test information function of a test of our
interest assumes high values only for a single, finite interval of
ability $\theta$, and, therefore, the subset is a finite interval, as is the
case with LIS-U (Indow and Samejima, 1962, 1966), which was used in the
previous study (Samejima, RR-80-3) as an example of a short test. In
such a situation, the interval, $(\underline{\theta}, \overline{\theta})$, which was introduced in the
preceding paragraph, can be considered as the subset. By virtue of
the substitute estimates, $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$, for the negative and
positive infinities of the maximum likelihood estimate, this subset, or
interval, has been enlarged, and, moreover, we can obtain an approximately
unbiased estimate of ability for this range of ability. We define the
estimator $\theta^*_V$ such that

(2.15)  $\theta^*_V = \begin{cases} \theta^*_{V\text{-min}} & \text{for } V = V\text{-min} \\ \theta^*_{V\text{-max}} & \text{for } V = V\text{-max} \\ \hat{\theta}_V & \text{otherwise .} \end{cases}$

Hereafter, we shall call this estimator $\theta^*_V$ the modified maximum
likelihood estimator.


Table 2-1 presents the discrimination parameter, $a_g$, and
the difficulty parameter, $b_g$, of each of the seven binary test items
of LIS-U, which follows the normal ogive model on the dichotomous
response level, and whose item characteristic function is given by (2.2).
The test information function, $I(\theta)$, and its square root, of LIS-U
are also shown as Figure 2-1, by solid and dotted lines, respectively.
Figure 2-2 presents the operating characteristics of the two extreme
response patterns, V-min and V-max, of LIS-U by solid and dotted
lines, respectively, and the position of $\theta_c$ by an arrow. This critical
value of $\theta$ is defined as the point at which the product of $P_{V\text{-min}}(\theta)$
and $P_{V\text{-max}}(\theta)$ is maximal.¹ It turned out that $\theta_c = -0.0088$, and
$P_{V\text{-min}}(\theta_c) = 0.0027$ and $P_{V\text{-max}}(\theta_c) = 0.0031$. These values satisfy
the requirement in defining $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$ that for $\theta < \theta_c$
$P_{V\text{-max}}(\theta)$ assumes negligibly small values and so does $P_{V\text{-min}}(\theta)$ for
$\theta > \theta_c$ (cf. Samejima, RR-80-3).

¹ There is a typographical error in RR-80-3: on page 84, line 2 from the
bottom, "minimal" should be replaced by "maximal."

FIGURE 2-1

Test Information Function (Solid Line) and Its Square Root
(Dotted Line) of LIS-U. [Abscissa: latent trait $\theta$.]

FIGURE 2-2

Operating Characteristics of the Two Extreme Response Patterns,
V-min (Solid Line) and V-max (Dotted Line), of LIS-U.

TABLE 2-2

$\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$ Obtained upon Each of the Eight Smallest
Intervals of $(\underline{\theta}, \overline{\theta})$, the Square Root of the Test Information
at Each Endpoint of Each Interval, and the Upper Bound of the
Discrepancies between the Regression of $\theta^*_V$ on $\theta$ and $\theta$
Itself, for LIS-U.

The values of $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$ have been obtained for eleven
different intervals of $(\underline{\theta}, \overline{\theta})$ in the previous study (Samejima, RR-80-3),
and the regressions of the estimate $\theta^*_V$ on ability $\theta$ have also been
illustrated for these eleven cases. It has been observed that the
approximate unbiasedness of the modified maximum likelihood estimate
$\theta^*_V$ holds better for smaller intervals of $(\underline{\theta}, \overline{\theta})$, while the violation
of the unbiasedness becomes more conspicuous as the interval becomes
larger. Table 2-2 presents the values of $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$ obtained
upon each of the eight smallest intervals of $(\underline{\theta}, \overline{\theta})$, with the square
root of the test information function at $\theta = \underline{\theta}$ and $\theta = \overline{\theta}$, respectively,
together with the upper bound of the discrepancies between the regression
of $\theta^*_V$ and the true value of $\theta$. We can see in this table that, even

for the smallest interval, (-1.50, 1.50), the square root of the test
information assumes values as low as 1.44 and 1.49, respectively,
at the two endpoints of the interval, and yet the upper bound of the
discrepancies is considerably low for the first, say, five intervals.
This fact indicates that, in spite of the relatively low amounts of
test information of LIS-U, the introduction of $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$
has succeeded in providing us with an approximately unbiased estimator,
i.e., the modified maximum likelihood estimator $\theta^*_V$, which can be
used for a fairly large interval of $\theta$. Since the least finite value
of the maximum likelihood estimate is -1.3167 for the response pattern,
(0,0,0,1,0,0,0), and the greatest finite value is 1.3028 for
(1,1,1,0,1,1,1), the pair of values, -1.479 and 1.522, obtained
for $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$ upon the interval, (-1.50, 1.50), appears
reasonable enough. We could expand the interval, however, by using
one of the other pairs of estimates, which are larger in absolute values
than the above values of $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$, and, with the trade-off
of the length of interval against the upper bound of the discrepancies
between the regression and the true value of $\theta$, we may conclude that
we should use one of the first five intervals of $(\underline{\theta}, \overline{\theta})$, the largest
of which is (-2.50, 2.50).

In the present study, we choose the interval (-2.25, 2.25) for
$(\underline{\theta}, \overline{\theta})$, which provides us with -1.925 and 1.892 for $\theta^*_{V\text{-min}}$ and
$\theta^*_{V\text{-max}}$, respectively. As we can see in Table 2-2, this selection
assures us that the conditional expectation of our estimate, given ability
$\theta$, does not differ from the true value of $\theta$ by more than 0.42,
at any point of the interval of $\theta$. Figure 2-3 presents the regression
of $\theta^*_V$ on $\theta$, which is given by

(2.16)  $E(\theta^*_V \mid \theta) = \sum_V \theta^*_V P_V(\theta)$ ,

by a solid curve. In the same figure, also presented is the standard
error of estimate, which is defined by

(2.17)  $\{E[\theta^*_V - E(\theta^*_V \mid \theta)]^2\}^{1/2} = \{\sum_V [\theta^*_V - E(\theta^*_V \mid \theta)]^2 P_V(\theta)\}^{1/2}$


and is plotted, vertically, by dots in both negative and positive
directions from the regression. There is a straight line with forty-five
degrees from the abscissa of the figure, which indicates the unbiasedness,
and, hereafter, we shall call it the unbiasedness line. The reciprocal
of the square root of the test information function, which is usually
considered as the standard error in the maximum likelihood estimation,
is also plotted by dotted lines, vertically, in both negative and
positive directions from the unbiasedness line. It is interesting to
note that, while the standard error for the maximum likelihood estimate
increases as the regression departs from the center of the interval,
the counterpart for the modified maximum likelihood estimate, $\theta^*_V$,
decreases, and the two dotted curves for the latter and the regression
itself converge to $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$, respectively, as $\theta$ tends
to negative and positive infinities. We can also see that, for almost
the entire range of the interval, the unbiasedness line lies within
the vertical interval of the standard error for the modified maximum
likelihood estimate, $\theta^*_V$.

FIGURE 2-3

Regression of the Modified Maximum Likelihood Estimate $\theta^*_V$ (Solid
Curve) Based upon the Interval, $-2.25 < \theta < 2.25$, on Ability $\theta$,
for LIS-U. The Standard Error of Estimate Is Plotted by Dots in
Both Vertically Positive and Negative Directions from the
Regression, As a Function of Ability $\theta$. [Abscissa: latent trait $\theta$.]
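The regression (2.16) and the standard error of estimate (2.17) are both finite sums over response patterns, so for a short binary test they can be evaluated directly. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the items are binary normal ogive items with made-up parameters, and the "estimates" in the usage lines are a toy scoring rule (number correct minus 1.5) chosen only because its mean and variance are known in closed form; they are not the estimates of the study.

```python
import itertools
import math


def cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def pattern_prob(theta, v, a, b):
    """P_V(theta) of (2.5) for binary normal ogive items."""
    p = 1.0
    for x, ag, bg in zip(v, a, b):
        z = ag * (theta - bg)
        p *= cdf(z) if x == 1 else cdf(-z)
    return p


def regression_and_se(theta, estimates, a, b):
    """Return (2.16) and (2.17) at theta.

    estimates maps every response pattern (a tuple of 0/1 scores) to a
    finite estimate, for example the modified estimate of (2.15).
    """
    probs = {v: pattern_prob(theta, v, a, b) for v in estimates}
    mean = sum(est * probs[v] for v, est in estimates.items())
    var = sum((est - mean) ** 2 * probs[v] for v, est in estimates.items())
    return mean, math.sqrt(var)


# Toy usage with three hypothetical items symmetric about theta = 0.
a = [1.0, 1.0, 1.0]
b = [-1.0, 0.0, 1.0]
toy = {v: float(sum(v)) - 1.5 for v in itertools.product((0, 1), repeat=3)}
mean0, se0 = regression_and_se(0.0, toy, a, b)
print(mean0, se0)
```

Evaluating `regression_and_se` on a grid of $\theta$ values and comparing the regression with $\theta$ itself gives the kind of curves described for Figure 2-3.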

It has been pointed out (Samejima, RR-80-3) that, unlike the
modified maximum likelihood estimate, $\theta^*_V$, any Bayesian estimate
involves the bias caused by the prior, which contradicts the principle
of the objectivity of testing. Let $\tilde{\theta}_V$ be the Bayes modal estimate
for a specific response pattern $V$. This estimate is defined as the
value of $\theta$ at which the function $B_V(\theta)$, which is given by

(2.18)  $B_V(\theta) = f(\theta) P_V(\theta)$ ,

assumes the maximal value, where $f(\theta)$ is the density function of $\theta$,
or the prior. Figure 2-4 presents the regression of the Bayes modal
estimate $\tilde{\theta}_V$, which is obtained by replacing $\theta^*_V$ by $\tilde{\theta}_V$ in (2.16),
and the vertical interval similar to the one in Figure 2-3, with the
standard error of estimate obtained by replacing $\theta^*_V$ by $\tilde{\theta}_V$ in (2.17),
by solid and dotted lines, respectively. Comparison of this figure
with Figure 2-3 reveals that the vertical interval for the Bayes modal
estimate $\tilde{\theta}_V$ contains the unbiasedness line only for a much smaller
interval of $\theta$, i.e., approximately (-1.38, 1.41), and outside this
interval both the vertical interval and the regression converge quickly
to $\tilde{\theta}_{V\text{-min}} = -1.3617$ and $\tilde{\theta}_{V\text{-max}} = 1.3542$, respectively, as $\theta$
tends to negative and positive infinities.

FIGURE 2-4

Regression of the Bayes Modal Estimate $\tilde{\theta}_V$ with the Prior n(0,1)
(Solid Curve) on Ability $\theta$, for LIS-U. The Standard Error of
Estimate Is Plotted by Dots in Both Vertically Positive and
Negative Directions from the Regression, As a Function of
Ability $\theta$.

We shall observe another estimate and its regression on ability
$\theta$ and the similar vertical interval for the standard error of estimate.
This is the Bayes estimate, $\mu'_{1V}$, with the same prior, n(0,1), as we
used for the Bayes modal estimate, $\tilde{\theta}_V$. This estimator is defined by

(2.19)  $\mu'_{1V} = \int_{-\infty}^{\infty} \theta f_V(\theta) \, d\theta$ ,

where $f_V(\theta)$ is the density function of $\theta$ for the subgroup of examinees
whose response patterns are uniformly $V$, which is given by

(2.20)  $f_V(\theta) = f(\theta) P_V(\theta) \left[ \int_{-\infty}^{\infty} f(\theta) P_V(\theta) \, d\theta \right]^{-1}$ .

Figure 2-5 presents, for the Bayes estimate, $\mu'_{1V}$, a set of functions
similar to those which we have observed for both the modified maximum
likelihood estimate, $\theta^*_V$, and the Bayes modal estimate, $\tilde{\theta}_V$, in
Figures 2-3 and 2-4, respectively. They are the regression obtained
by replacing $\theta^*_V$ by $\mu'_{1V}$ in (2.16) and the interval based upon the
standard error of estimate obtained by the similar replacement in (2.17).
We can see that this set of results is very much like the one we obtained
for the Bayes modal estimate with the same prior, n(0,1), which we have
observed in Figure 2-4. The interval of $\theta$ for which the vertical
interval of the standard error of estimate includes the unbiasedness line
is approximately (-1.40, 1.28), and outside of this small interval of
$\theta$ the three curves converge quickly to $\mu'_{1V\text{-min}} = -1.3764$ and
$\mu'_{1V\text{-max}} = 1.2695$, respectively.

FIGURE 2-5

Regression of the Bayes Estimate $\mu'_{1V}$ with the Prior n(0,1)
(Solid Curve) on Ability $\theta$, for LIS-U. The Standard Error of
Estimate Is Plotted by Dots in Both Vertically Positive and
Negative Directions from the Regression, As a Function of
Ability $\theta$.

The similarity observed in the above two results makes us wonder
if we can define an estimator which is population-free, and has an
analogous meaning to the maximum likelihood estimator as the Bayes
estimator has to the Bayes modal estimator. Such an estimator may be a
better estimator than the modified maximum likelihood estimator, $\theta^*_V$,
or may not. To find out, we shall introduce an estimator defined by (2.19)
and (2.20), where $f(\theta)$ is a uniform density function for the interval,
$(\underline{\theta}, \overline{\theta})$. Let $\mu^*_{1V}$ denote this Bayes estimate with the uniform prior.
We can write from (2.19) and (2.20) that

(2.21)  $\mu^*_{1V} = \int_{\underline{\theta}}^{\overline{\theta}} \theta P_V(\theta) \, d\theta \left[ \int_{\underline{\theta}}^{\overline{\theta}} P_V(\theta) \, d\theta \right]^{-1}$ .

Note that (2.21) includes only the conditional probability of response
pattern $V$, given ability $\theta$, and, therefore, is population-free.
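The estimate in (2.21) is a ratio of two one-dimensional integrals and can be approximated by any standard quadrature rule. The sketch below is illustrative only: it assumes binary normal ogive items, uses Simpson's rule, and the function names and item parameters are hypothetical rather than taken from the study.

```python
import math


def cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def pattern_prob(theta, v, a, b):
    """P_V(theta) of (2.5) for binary normal ogive items."""
    p = 1.0
    for x, ag, bg in zip(v, a, b):
        z = ag * (theta - bg)
        p *= cdf(z) if x == 1 else cdf(-z)
    return p


def integrate(f, lo, hi, n=400):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0


def uniform_prior_bayes(v, a, b, theta_lo, theta_hi):
    """Bayes estimate with the uniform prior on (theta_lo, theta_hi), (2.21)."""
    num = integrate(lambda t: t * pattern_prob(t, v, a, b), theta_lo, theta_hi)
    den = integrate(lambda t: pattern_prob(t, v, a, b), theta_lo, theta_hi)
    return num / den


# Hypothetical three-item test, symmetric about theta = 0.
a = [1.0, 1.5, 1.0]
b = [-1.0, 0.0, 1.0]
est_min = uniform_prior_bayes((0, 0, 0), a, b, -2.25, 2.25)
est_max = uniform_prior_bayes((1, 1, 1), a, b, -2.25, 2.25)
print(est_min, est_max)
```

Unlike the maximum likelihood estimate, this quantity is finite for the two extreme response patterns, since it is a weighted mean of points inside the chosen interval.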

Figure 2-6 presents the regression of $\mu^*_{1V}$ on $\theta$, which is
obtained by replacing $\theta^*_V$ by $\mu^*_{1V}$ in (2.16), and the vertical interval
of the standard error of estimate, which is obtained by replacing $\theta^*_V$
by $\mu^*_{1V}$ in (2.17), by solid and dotted lines, respectively. As was
expected, this set of results is similar to the one obtained upon the
modified maximum likelihood estimate, $\theta^*_V$, which is shown as Figure 2-3.
A close observation reveals, however, that the interval of $\theta$ for which
the vertical interval of the standard error of estimate includes the
unbiasedness line is somewhat smaller, i.e., (-1.74, 1.76), compared
with (-2.08, 2.06) for the modified maximum likelihood estimate, although
this interval is substantially larger than the intervals found for
the two Bayesian estimates. It is also noted by comparing Figures 2-3
and 2-6 that, for the entire interval of $(\underline{\theta}, \overline{\theta})$, the regression of
the modified maximum likelihood estimate, $\theta^*_V$, tends to be closer
to the unbiasedness line than the regression of $\mu^*_{1V}$. The values of
$\mu^*_{1V}$ for the two extreme response patterns, V-min and V-max, turned
out to be more conservative than those of $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$,
i.e., -1.6515 vs. -1.9254 and 1.6430 vs. 1.8923, respectively.

The four sets of estimates for the total of one hundred and twenty-eight
response patterns of LIS-U are shown in the Appendix, as Table A-1.
They are also plotted pairwise in six scatter diagrams, and presented
as Figures 2-7 through 2-12. As is expected, the scatter diagram of
$\mu^*_{1V}$ plotted against the modified maximum likelihood estimate, $\theta^*_V$,
and that of the Bayes modal estimate, $\tilde{\theta}_V$, plotted against the Bayes
estimate, $\mu'_{1V}$, which are shown as Figures 2-7 and 2-12, respectively,
are almost on the line with forty-five degrees from the abscissa, while
in the other four combinations of estimates they deviate consistently and
substantially from this line. It is interesting to note that,
in Figure 2-7, for all the other one hundred and twenty-six response
patterns excluding V-min and V-max, the values of $\theta^*_V$ and $\mu^*_{1V}$
are very close to each other, while for these two extreme response
patterns the latter are substantially smaller in absolute values than
the former.

From these results, we can say that the modified maximum likelihood
estimator, $\theta^*_V$, provides us with a better approximation to the unbiasedness
and is a better estimator than $\mu^*_{1V}$, although the latter is also
population-free and is a much better estimator than the Bayesian types of
estimators in satisfying the principle of objective testing.

FIGURE 2-7

Bayes Estimate with the Uniform Prior, $\mu^*_{1V}$, Plotted
against the Modified Maximum Likelihood Estimate, $\theta^*_V$,
for the One Hundred and Twenty-eight Possible
Response Patterns of LIS-U.

FIGURE 2-8

Bayes Modal Estimate, $\tilde{\theta}_V$, with the Prior n(0,1), Plotted
against the Modified Maximum Likelihood Estimate, $\theta^*_V$,
for the One Hundred and Twenty-eight Possible
Response Patterns of LIS-U.

FIGURE 2-9

Bayes Estimate, $\mu'_{1V}$, with the Prior n(0,1), Plotted
against the Modified Maximum Likelihood Estimate, $\theta^*_V$,
for the One Hundred and Twenty-eight Possible
Response Patterns of LIS-U.

FIGURE 2-10

Bayes Modal Estimate, $\tilde{\theta}_V$, with the Prior n(0,1), Plotted
against the Bayes Estimate with the Uniform Prior, $\mu^*_{1V}$,
for the One Hundred and Twenty-eight Possible
Response Patterns of LIS-U.

FIGURE 2-11

Bayes Estimate, $\mu'_{1V}$, with the Prior n(0,1), Plotted
against the Bayes Estimate with the Uniform Prior, $\mu^*_{1V}$,
for the One Hundred and Twenty-eight Possible
Response Patterns of LIS-U.

III  Sample Statistic Versions of the Alternative Estimators for the
Two Extreme Response Patterns

The introduction of the two alternative estimators for V-min
and V-max and the resultant modified maximum likelihood estimate, $\theta^*_V$,
has enhanced the usefulness of relatively short and less informative
tests, without sacrificing the objectivity of testing. When the number
of items is as small as seven and all items are binary items, as is
the case with LIS-U, the computation of $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$ is
relatively easy, owing to the fact that the number of all possible
response patterns is as small as 128. Note, however, that the increase
in the number of items, and/or in the number of item scores for each
item, will soon make it practically impossible to compute these two
substitute estimates, since the number of all possible response
patterns will increase by gigantic steps. For example, if a test has
ten binary items instead of seven, the number of all possible response
patterns will be 1,024; if a test has seven three-item-score-category
items, the number of all possible response patterns will be 2,187;
if a test has fifteen three-item-score-category items, it will be as
large as 14,348,907.
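The counts quoted above are simply products of the numbers of item score categories over the items. A short check, with a hypothetical helper name `n_patterns`, might look as follows.

```python
import math


def n_patterns(categories):
    """Number of possible response patterns, given the number of
    item score categories of each item."""
    return math.prod(categories)


print(n_patterns([2] * 7))    # seven binary items
print(n_patterns([2] * 10))   # ten binary items
print(n_patterns([3] * 7))    # seven three-category items
print(n_patterns([3] * 15))   # fifteen three-category items
```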

For the reason described in the preceding paragraph, it is
necessary that we should invent some device in dealing with the
situation in which the number of all possible response patterns is
too large for us to compute $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$ directly. By
virtue of the availability of electronic computers and the Monte
Carlo method, this can be done by introducing the sample statistic
versions of the two estimators.

Let $N$ be the number of examinees who were selected randomly
from the uniform distribution for the interval of $\theta$, $(\underline{\theta}, \overline{\theta})$. Let
$N_L$ denote the number of examinees who belong to the above sample and
whose levels of ability are lower than the critical value $\theta_c$, and
$N_H$ be that of those whose ability levels are higher than, or equal
to, $\theta_c$. Thus we can write

(3.1)  $N = N_L + N_H$ .

Let $N_{LV}$ and $N_{HV}$ denote the numbers of examinees who obtained the
response pattern $V$, in the above two subgroups of the sample,
respectively. Thus we have

(3.2)  $N_L = \sum_V N_{LV}$ ,  $N_H = \sum_V N_{HV}$ .


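It can be seen that the sample statistic corresponding to
$\int_{\underline{\theta}}^{\theta_c} P_V(\theta) \, d\theta$ in the formula (2.14) is $N_{LV} (\theta_c - \underline{\theta}) N_L^{-1}$, and also the one for
$\int_{\theta_c}^{\overline{\theta}} P_V(\theta) \, d\theta$ is $N_{HV} (\overline{\theta} - \theta_c) N_H^{-1}$, since under the uniform
distribution the relative frequency of $V$ in each subgroup estimates the
mean of $P_V(\theta)$ over the corresponding subinterval. Substituting these sample
statistics into (2.14) and rearranging, we obtain $\hat{\theta}^*_{V\text{-min}}$ and
$\hat{\theta}^*_{V\text{-max}}$ such that

(3.3)
$\hat{\theta}^*_{V\text{-min}} = \left[ \tfrac{1}{2}(\theta_c + \underline{\theta}) N_L - \sum_{\substack{V \neq V\text{-min} \\ V \neq V\text{-max}}} \hat{\theta}_V N_{LV} \right] N_{LV\text{-min}}^{-1}$

$\hat{\theta}^*_{V\text{-max}} = \left[ \tfrac{1}{2}(\overline{\theta} + \theta_c) N_H - \sum_{\substack{V \neq V\text{-min} \\ V \neq V\text{-max}}} \hat{\theta}_V N_{HV} \right] N_{HV\text{-max}}^{-1}$ ,

where $N_{LV\text{-min}}$ and $N_{HV\text{-max}}$ are the numbers of examinees who belong
to the lower subgroup and obtained the response pattern V-min, and
those who belong to the upper subgroup and obtained the response
pattern V-max, respectively.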

It can be seen that $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$, which were defined
in the preceding paragraph, are consistent, or converge in probability
to $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$, respectively, as the sample sizes increase.
In other words, if $N_L$, $N_H$, $N_{LV\text{-min}}$ and $N_{HV\text{-max}}$ are large enough,
the probabilities with which $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$ assume values within
the vicinities of $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$, respectively, will be very
high. Although the two numbers, $N_{LV\text{-min}}$ and $N_{HV\text{-max}}$, also
depend upon the choice of the interval, $(\underline{\theta}, \overline{\theta})$, by virtue of the
Monte Carlo method, we can control the two sample sizes, $N_L$ and $N_H$,
as we wish.

A procedure with which we may obtain $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$,
which are defined by (3.3), can be summarized as follows.

(1)  Determine the interval, $(\underline{\theta}, \bar{\theta})$.

(2)  Obtain the critical value, $\theta_c$.

(3)  Determine the sample size, $N$, which makes both $N_L$ and
$N_H$ large enough for our purpose.

(4)  Produce the ability levels of these $N$ hypothetical subjects
from the uniform distribution for the interval, $(\underline{\theta}, \bar{\theta})$. This
can be done either by the Monte Carlo method, or by placing
the $N$ examinees at equally spaced points in the entire
interval, $(\underline{\theta}, \bar{\theta})$, or by using one of its variations.

(5)  Calibrate by the Monte Carlo method a response pattern for
each of the $N$ hypothetical examinees with respect to the
$n$ test items of our test.

(6)  Find the two frequencies, $N_{LV}$ and $N_{HV}$, for each response
pattern $V$.

(7)  Obtain the maximum likelihood estimate $\hat{\theta}_V$ for each response
pattern whose frequencies, $N_{LV}$ and $N_{HV}$, are not both zero,
excluding V-min and V-max.

(8)  Use the above results in (3.3), and compute $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$.

Note that the probabilities with which we obtain positive frequencies
for $N_{LV\text{-max}}$ and $N_{HV\text{-min}}$ are both negligibly small, and this
fact can be used as a checking process.
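Step (8) of the procedure can be sketched in code. The following is a minimal illustration of (3.3), assuming the frequencies and maximum likelihood estimates of steps (6) and (7) are already at hand. The function name `substitute_estimates`, its input layout, and the toy numbers are hypothetical and are not taken from Subtest 3. The sketch also assumes that nobody in the lower subgroup obtained V-max and nobody in the upper subgroup obtained V-min, which is the checking condition noted above.

```python
def substitute_estimates(theta_lo, theta_c, theta_hi, interior, n_lv_min, n_hv_max):
    """Sample statistic versions of the two substitute estimates, following (3.3).

    interior  -- list of (mle, n_lv, n_hv) tuples, one per response pattern
                 other than V-min and V-max (hypothetical input layout)
    n_lv_min  -- frequency of V-min in the lower subgroup
    n_hv_max  -- frequency of V-max in the upper subgroup
    Assumes V-max never occurs in the lower subgroup and V-min never occurs
    in the upper subgroup.
    """
    # Subgroup sizes, as in (3.2)
    n_l = n_lv_min + sum(n_lv for _, n_lv, _ in interior)
    n_h = n_hv_max + sum(n_hv for _, _, n_hv in interior)
    # Frequency-weighted sums of the finite maximum likelihood estimates
    sum_l = sum(mle * n_lv for mle, n_lv, _ in interior)
    sum_h = sum(mle * n_hv for mle, _, n_hv in interior)
    est_min = (0.5 * (theta_c + theta_lo) * n_l - sum_l) / n_lv_min
    est_max = (0.5 * (theta_hi + theta_c) * n_h - sum_h) / n_hv_max
    return est_min, est_max


# Toy data: one interior pattern per subgroup, eight examinees each,
# plus two examinees with each extreme pattern.
toy = [(-0.5, 8, 0), (0.5, 0, 8)]
est_min, est_max = substitute_estimates(-2.0, 0.0, 2.0, toy, 2, 2)
print(est_min, est_max)
```

In the toy data the substitute estimates fall outside the interval itself, which shows that (3.3) places no bound on them; their values are whatever is needed to balance the subgroup means against the midpoints of the two subintervals.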

For the purpose of illustration, we shall use the five hundred
hypothetical examinees, who have been used repeatedly in our previous
studies of estimating the operating characteristics of the discrete
item responses (Samejima, 1977c, RR-77-1, RR-78-1, RR-78-2, RR-78-3,
RR-78-4, RR-78-5, RR-78-6, RR-80-2, RR-80-4), and Subtest 3, which
is one of the subtests from the Old Test of thirty-five test items
with three item score categories for each item. These hypothetical
examinees are placed at the one hundred points of ability $\theta$, which
start from -2.475 and are equally spaced by the distance 0.05, with
five examinees positioned together at each point. For this reason,
they can be considered as a representative sample from the uniformly
distributed population for the interval of $\theta$, (-2.5, 2.5). The
fifteen test items of Subtest 3 follow the normal ogive model, whose
operating characteristic is given by (2.2). The item discrimination
parameter, $a_g$, and the two item response difficulty parameters, $b_{x_g}$
for $x_g = 1$ and $x_g = 2$, of these items of Subtest 3 are presented in
Table 3-1. The square root of the test information function of Subtest 3,
which is computed through (2.12), is drawn by a solid line in Figure 3-1.
We can see that the function is unimodal, which means that Subtest 3
is more informative around the middle of the interval of $\theta$, (-2.5, 2.5),
and less informative as $\theta$ departs from the middle. Figure 3-2
presents the two operating characteristics, $P_{V\text{-min}}(\theta)$ and
$P_{V\text{-max}}(\theta)$, by solid and dotted lines, respectively, together with
the critical value, $\theta_c$, which equals -0.4146. Unlike the one
we used for LIS-U in the preceding chapter, this value does not
make the product of the two operating characteristics maximal, but
is more deviated toward the negative side. And yet both $P_{V\text{-min}}(\theta)$
and $P_{V\text{-max}}(\theta)$ are practically zero at this point of $\theta$.

TABLE 3-1

Item Discrimination Parameter, $a_g$, and Item Response
Difficulty Parameters, $b_{x_g}$, for $x_g = 1$ and $x_g = 2$,
for Each of the Thirty-Five Test Items of the Old Test.
The Fifteen Test Items Which Belong to Subtest 3 Are
Marked with Crosses.

Item g    a_g            b_1            b_2      Subtest 3
   1      1.8            [illegible]    -3.75
   2      1.9            [illegible]    -3.50
   3      2.0            -4.25          -3.25
   4      1.5            -4.00          -3.00
   5      1.6            -3.75          -2.75
   6      1.4            -3.50          -2.50
   7      1.9            -3.00          -2.00
   8      1.8            -3.00          -2.00
   9      1.6            -2.75          -1.75
  10      2.0            -2.50          -1.50
  11      1.5            -2.25          -1.25        X
  12      1.7            -2.00          -1.00        X
  13      1.5            -1.75          -0.75        X
  14      1.4            -1.50          -0.50        X
  15      2.0            -1.25          -0.25        X
  16      1.6            -1.00           0.00        X
  17      1.8            -0.75           0.25        X
  18      [illegible]    -0.50           0.50        X
  19      [illegible]    -0.25           0.75        X
  20      [illegible]     0.00           1.00        X
  21      1.5             0.25           1.25        X
  22      1.8             0.50           1.50        X
  23      1.4             0.75           1.75        X
  24      1.9             1.00           2.00        X
  25      2.0             1.25           2.25        X
  26      1.6             1.50           2.50
  27      1.7             1.75           2.75
  28      1.4             2.00           3.00
  29      1.9             2.25           3.25
  30      1.6             2.50           3.50
  31      [illegible]     2.75           3.75
  32      [illegible]     3.00           4.00
  33      [illegible]     3.25           4.25
  34      [illegible]     3.50           4.50
  35      [illegible]     3.75           4.75


FIGURE 3-1

Square Root of Test Information Function (Solid Line), and Its
Approximation by the Polynomial of Degree 7 (Dotted Line), of
Subtest 3.


FIGURE 3-2

[Beginning of caption illegible] Position of the Critical Value, $\theta_c$.


Since the response pattern for each of the five hundred hypothetical
examinees with respect to the Old Test was already calibrated (Samejima,
1977c), we have used the subset of this response pattern for the fifteen
items of Subtest 3. The maximum likelihood estimate for each response
pattern was obtained by using the basic functions, $A_{x_g}(\theta)$, and (2.6).
It turned out that only fourteen examinees out of five hundred obtained
the response pattern V-min, and twelve obtained the response pattern
V-max. These relatively small numbers of examinees who obtained
either one of the two extreme response patterns indicate that, for
Subtest 3, the interval (-2.5, 2.5) is still too conservative to use
as $(\underline{\theta}, \bar{\theta})$, and it may be expanded further. The two sample sizes,
$N_L$ and $N_H$, are 210 and 290, respectively. Substituting these values,
together with $N_{LV\text{-min}} = 14$, $N_{HV\text{-max}} = 12$, $\underline{\theta} = -2.5$, $\bar{\theta} = 2.5$ and
$\theta_c = -0.4146$, into (3.3), we obtained $\hat{\theta}^*_{V\text{-min}} = -2.31453$ and
$\hat{\theta}^*_{V\text{-max}} = 2.04027$.
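The two sample sizes quoted above follow directly from the way the examinees are positioned. As a small check, assuming only the grid described in the text (one hundred points starting at -2.475 in steps of 0.05, five examinees per point) and the critical value -0.4146, the counts can be reproduced as follows. The variable names are illustrative.

```python
THETA_C = -0.4146  # critical value reported for Subtest 3

# One hundred equally spaced ability points, five examinees at each point
points = [-2.475 + 0.05 * k for k in range(100)]
abilities = [p for p in points for _ in range(5)]

# Lower subgroup: ability below the critical value; upper subgroup: the rest
n_l = sum(1 for t in abilities if t < THETA_C)
n_h = sum(1 for t in abilities if t >= THETA_C)

print(len(abilities), n_l, n_h)
```

Forty-two of the one hundred points lie below the critical value, which gives the lower subgroup its 210 examinees.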


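Let $\hat{\theta}^*_V$ denote a new estimator, which is defined by

(3.4)
$$
\hat{\theta}^*_V
\begin{cases}
= \hat{\theta}^*_{V\text{-min}} & \text{for } V = V\text{-min} \\
= \hat{\theta}^*_{V\text{-max}} & \text{for } V = V\text{-max} \\
= \hat{\theta}_V & \text{otherwise,}
\end{cases}
$$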

as distinct from $\theta^*_V$, which is defined by (2.15). Figure 3-3 presents
the scatter diagram of the five hundred hypothetical examinees with
respect to their ability levels, $\theta$, and the estimate, $\hat{\theta}^*_V$. We
notice that, for fixed values of $\theta$, the values of $\hat{\theta}^*_V$ scatter more
widely as $\theta$ departs from the middle of the interval, (-2.5, 2.5),
but then start having truncated distributions as $\theta$ approaches either
$\underline{\theta}$ or $\bar{\theta}$. For the purpose of comparison, Figure 3-4 presents the
corresponding scatter diagram of the same five hundred hypothetical
examinees, with the maximum likelihood estimate $\hat{\theta}_V$, which was based
upon the original Old Test, on the ordinate. In this figure, the values
of the maximum likelihood estimate for fixed ability levels scatter
approximately within the same range for the entire interval of $\theta$,
reflecting the fact that the test information function of the Old Test
assumes approximately the same value throughout the interval of $\theta$ in
question.


FIGURE 3-4

Scatter Diagram of the Maximum Likelihood Estimate $\hat{\theta}_V$
and Ability $\theta$ for the Five Hundred Hypothetical
Examinees, Based Upon the Original Old Test.

The sample linear regression of $\hat{\theta}^*_V$ on ability $\theta$, or the
best fitted linear function of $\theta$ which makes the sum total of the
squared discrepancies of $\hat{\theta}^*_V$ minimal, turned out to be

(3.5)  $0.99546\,\theta - 0.00730$ .

This function of $\theta$ is shown as the straight line in Figure 3-3, which
is practically indistinguishable from the unbiasedness line, or the
line with forty-five degrees from the abscissa which passes the origin
of the two axes, (0,0). The corresponding sample linear regression
for the scatter diagram, which is based upon the original Old Test
and shown in Figure 3-4, proved to be $1.004\,\theta - 0.006$ (Samejima,
1977c). We can say that these two results are practically the same.
The sample regression of $\hat{\theta}^*_V$ on ability $\theta$ was obtained for the
one hundred ability levels, and is shown in Appendix as Figure A-1.

The mean and variance of $\theta$ for the five hundred hypothetical
examinees are 0.0000 and 2.0831, and those of $\hat{\theta}^*_V$ turned out to
be -0.0073 and 2.1290, respectively. The product-moment
correlation coefficient between $\theta$ and $\hat{\theta}^*_V$ for the five hundred
hypothetical examinees is found to be 0.985.

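Since we have for the uniform distribution

(3.6)
$$
E(\theta) = \int_{\underline{\theta}}^{\bar{\theta}} \theta\,(\bar{\theta} - \underline{\theta})^{-1}\,d\theta = (\underline{\theta} + \bar{\theta})/2
$$

and

(3.7)
$$
\mathrm{Var.}(\theta) = \int_{\underline{\theta}}^{\bar{\theta}} [\theta - E(\theta)]^2\,(\bar{\theta} - \underline{\theta})^{-1}\,d\theta = (\bar{\theta} - \underline{\theta})^2/12 \; ,
$$

in the present case of $\underline{\theta} = -2.5$ and $\bar{\theta} = 2.5$, we obtain

(3.8)  $E(\theta) = 0.0000$

and

(3.9)  $\mathrm{Var.}(\theta) = 2.0833$ .

As is expected from the way we produced the ability levels of our
five hundred hypothetical examinees, the above sample mean and
variance are practically the same as the population mean and
variance.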
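The agreement between the sample and population moments can be verified with a few lines of arithmetic. The sketch below rebuilds the grid of five hundred ability levels described earlier and compares its mean and variance with (3.6) and (3.7). It assumes the sample variance is computed with divisor N rather than N - 1, since that choice reproduces the reported 2.0831.

```python
# Grid of five hundred examinees: 100 points from -2.475 in steps of 0.05,
# five examinees at each point
points = [-2.475 + 0.05 * k for k in range(100)]
abilities = [p for p in points for _ in range(5)]

n = len(abilities)
sample_mean = sum(abilities) / n
# Divisor n (an assumption; it matches the reported 2.0831)
sample_var = sum((t - sample_mean) ** 2 for t in abilities) / n

# Population moments of the uniform distribution, (3.6) and (3.7)
lo, hi = -2.5, 2.5
pop_mean = (lo + hi) / 2
pop_var = (hi - lo) ** 2 / 12

print(round(sample_mean, 4), round(sample_var, 4))
print(round(pop_mean, 4), round(pop_var, 4))
```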

When an estimator, $\lambda$, of ability $\theta$ is conditionally
unbiased, or we can write

(3.10)  $E(\lambda \mid \theta) = \theta$ ,

the relationship such that

(3.11)  $E(\lambda) = E(\theta)$

holds in general, regardless of the distribution of $\theta$. In such a
case, the variance of $\lambda$ is found to be

(3.12)  $\mathrm{Var.}(\lambda) = \mathrm{Var.}(\theta) + E[\mathrm{Var.}(\lambda \mid \theta)] \ge \mathrm{Var.}(\theta)$

(cf. Samejima, 1977c), and the product-moment correlation coefficient
between $\theta$ and $\lambda$ is given by

(3.13)  $\mathrm{Corr.}(\theta, \lambda) = \bigl[\,1 - E\{\mathrm{Var.}(\lambda \mid \theta)\}\,\{\mathrm{Var.}(\lambda)\}^{-1}\bigr]^{1/2}$ .

The fact that the discrepancy of the mean of $\hat{\theta}^*_V$ for our five
hundred hypothetical examinees from the expectation of $\theta$ is less
than 0.01 supports this estimate for being a $\lambda$, the unbiased
estimate of $\theta$. Since the maximum likelihood estimate, $\hat{\theta}_V$, has a
characteristic that, for a fixed value of $\theta$, it asymptotically
distributes normally with $\theta$ and $[I(\theta)]^{-1/2}$ as the two parameters,
as the amount of test information tends to infinity, and this
convergence is relatively fast (Samejima, 1975, 1977a, 1977b),
$E[\mathrm{Var.}(\hat{\theta}_V \mid \theta)]$ can be approximated by $E[I(\theta)^{-1}]$ for the interval,
$(\underline{\theta}, \bar{\theta})$. For Subtest 3, we find that

(3.14)  $E[I(\theta)^{-1}] \doteq 0.0803$ .

From the above result we can see that the discrepancy of the variance
of the modified maximum likelihood estimate, $\hat{\theta}^*_V$, for our sample
from the population variance of $\theta$ is 0.0457, which is less
than $E[I(\theta)^{-1}]$ given as (3.14). If we substitute (3.14) for
$E[\mathrm{Var.}(\lambda \mid \theta)]$ and $\hat{\theta}_V$ for $\lambda$ in (3.12), we obtain for Subtest 3

(3.15)  $\mathrm{Var.}(\hat{\theta}_V) \doteq 2.1636$ .

Substituting (3.14) and (3.15) into (3.13), we obtain for the product-
moment correlation coefficient between $\theta$ and the maximum likelihood
estimate $\hat{\theta}_V$,

(3.16)  $\mathrm{Corr.}(\theta, \hat{\theta}_V) \doteq 0.981$ .

It is interesting to note that our sample variance of the modified
maximum likelihood estimate, $\hat{\theta}^*_V$, is slightly less than the estimated
population variance of the maximum likelihood estimate, $\hat{\theta}_V$, and our
sample correlation coefficient between $\theta$ and $\hat{\theta}^*_V$ is slightly greater
than the estimated population correlation coefficient between $\theta$ and
the maximum likelihood estimate $\hat{\theta}_V$.
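The arithmetic behind (3.15) and (3.16) can be reproduced from (3.7), (3.12), (3.13) and the value in (3.14). The short sketch below does only that; the variable names are illustrative, and the value 0.0803 is taken from (3.14) rather than recomputed from the test information function.

```python
import math

# Population variance of ability for the uniform distribution on (-2.5, 2.5), by (3.7)
var_theta = (2.5 - (-2.5)) ** 2 / 12

# Expected reciprocal of the test information function, as reported in (3.14)
mean_inv_info = 0.0803

# (3.12): variance of a conditionally unbiased estimator
var_mle = var_theta + mean_inv_info

# (3.13): correlation between ability and the estimator
corr = math.sqrt(1.0 - mean_inv_info / var_mle)

print(round(var_mle, 4), round(corr, 3))
```

The same two lines of arithmetic also give the comparison made in the text: the sample variance 2.1290 and the sample correlation 0.985 of the modified estimate sit slightly below and slightly above these population values, respectively.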

The error score, $e_s$, for each individual examinee $s$ is
defined by

(3.17)  $e_s = [\hat{\theta}^*_{V_s} - \theta_s]\,[I(\theta_s)]^{1/2}$ ,

where $\theta_s$ is the ability level of the examinee $s$, and $V_s$ indicates
the response pattern obtained by the examinee $s$. Note that in this
definition of the error score the discrepancy of the estimated ability
from the true ability is divided by the reciprocal of the square
root of the amount of test information. Thus, if $\hat{\theta}^*_V$ distributes,
approximately, normally with the true ability $\theta$ and the reciprocal
of the square root of the test information function as the two
parameters, then the error score $e_s$ will conditionally distribute,
approximately, normally with zero and unity as its two parameters,
regardless of the ability level $\theta$.

Figure 3-5 presents the cumulative frequency function of the
error score $e_s$ for the five hundred hypothetical examinees, which
was obtained upon Subtest 3, together with the distribution function
of the standard normal distribution. We can see in this figure that
this cumulative frequency function is fairly close to the standard
normal distribution function, a result which supports, though
weakly, the approximate normality for the conditional distribution
of $e_s$, given $\theta$. The corresponding cumulative frequency function
of $e_s$, which was obtained upon the original Old Test, is presented
in Figure 3-6, together with the standard normal distribution function.
Comparison of these two results reveals that the two cumulative
frequency functions are very similar to each other, regardless of
the fact that the former error score is defined for the modified
maximum likelihood estimate $\hat{\theta}^*_V$, which includes twenty-six substitute
estimates for negative and positive infinities, and the latter is with
respect to the maximum likelihood estimate $\hat{\theta}_V$ itself, and that the
square root of the test information function of Subtest 3 is unimodal
while that of the original Old Test is constant (= 4.65). The mean
of the error score for Subtest 3 turned out to be -0.025 and the
one for the original Old Test is -0.027, both of which are close to
zero. The standard deviation, or the estimated second parameter,
proved to be 0.972 for Subtest 3, and 0.995 for the original Old
Test, in comparison with unity for the standard normal distribution.
The variance of the error score is 0.944 for Subtest 3, and the one
for the original Old Test is 0.990. It is interesting to note that
the error score for Subtest 3 has less dispersion than the one for
the original Old Test. This fact is inconsistent with the finding in
the preceding chapter about the reduction of the standard error of
estimation for $\theta^*_V$ compared with the reciprocal of the square
root of the test information function of LIS-U.
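The role of the error score defined by (3.17) can be illustrated with a small simulation. The sketch below assumes an idealized estimator that is exactly normal around the true ability with standard deviation equal to the reciprocal of the square root of the information, and it uses a hypothetical unimodal information function, `20 - 2 * theta ** 2`, which is not that of Subtest 3. Under those assumptions the error scores should have mean near zero and standard deviation near unity at every ability level.

```python
import math
import random


def error_score(estimate, theta, info):
    """Error score of (3.17): discrepancy multiplied by the square root of the information."""
    return (estimate - theta) * math.sqrt(info)


rng = random.Random(0)  # fixed seed so the run is repeatable
scores = []
for _ in range(20000):
    theta = rng.uniform(-2.5, 2.5)
    info = 20.0 - 2.0 * theta ** 2          # hypothetical unimodal information function
    estimate = rng.gauss(theta, 1.0 / math.sqrt(info))  # idealized normal estimator
    scores.append(error_score(estimate, theta, info))

mean = sum(scores) / len(scores)
sd = math.sqrt(sum((e - mean) ** 2 for e in scores) / len(scores))
print(round(mean, 3), round(sd, 3))
```

Because the scaling removes the dependence on the information function, error scores from tests with quite different information functions, such as Subtest 3 and the original Old Test, can be compared against the same standard normal reference.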

The sample linear regression of the error score $e_s$ for
Subtest 3 on ability $\theta$ turned out to be $0.00342\,\theta + 0.00009$,
which is practically indistinguishable from the abscissa. This is
even stronger support for the independence of the error score and
ability $\theta$ than the result for the original Old Test, whose sample
linear regression turned out to be $0.04579\,\theta + 0.00124$.

The frequency distribution of the five hundred error scores
was obtained using the category width of 0.2, for both Subtest 3
and the original Old Test. Figures 3-7 and 3-8 present these two
results in the form of histograms, together with the standard normal
density function. The chi-square test for the goodness of fit was
made for each histogram against the standard normal density function,
by combining all the categories below $e_s = -2.8$ and those
above $e_s = 2.8$ into single categories, respectively. It turned
out that $\chi^2 = 35.2252$ with 29 degrees of freedom, which
provides us with .20 < p < .30, for Subtest 3, and $\chi^2 = 34.2248$
with 29 degrees of freedom, which also gives us .20 < p < .30,
for the original Old Test, respectively. In this process of the chi-
square test, there are fourteen categories for which the theoretical
frequencies are less than ten. If we combine them appropriately
so that all the frequencies become greater than, or equal to,
ten, then the fits will be even better.
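The category structure of this chi-square test can be reconstructed from the description alone. The sketch below, which uses only the standard library, builds the theoretical frequencies for five hundred examinees under the standard normal distribution with category width 0.2 and pooled tails beyond -2.8 and 2.8, then counts the categories and those whose theoretical frequency is below ten. The helper names `phi` and `chi_square` are illustrative, and no observed frequencies are assumed, since they are not given in the text.

```python
import math


def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def chi_square(observed, expected):
    """Pearson chi-square statistic for matching lists of frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


n = 500
# Category boundaries -2.8, -2.6, ..., 2.8; the two tails are pooled
edges = [round(-2.8 + 0.2 * i, 1) for i in range(29)]
cumulative = [0.0] + [phi(e) for e in edges] + [1.0]
expected = [n * (cumulative[i + 1] - cumulative[i]) for i in range(len(cumulative) - 1)]

n_categories = len(expected)
degrees_of_freedom = n_categories - 1
n_small = sum(1 for f in expected if f < 10.0)

print(n_categories, degrees_of_freedom, n_small)
```

The count of thirty categories accounts for the 29 degrees of freedom, and the categories with theoretical frequencies below ten are the seven outermost ones on each side.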

In the preceding chapter, we introduced and defined by (2.21)
another population-free estimator, $\mu^{*\prime}_{1V}$. We notice that, although
it is practically impossible to compute the values of this estimate
when the number of possible response patterns is too large, as is the
case with Subtest 3, we can compute these values for the two extreme
response patterns, V-min and V-max, and use them as substitutes
for negative and positive infinities of the maximum likelihood
estimate.


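The rationale behind these two estimates, $\mu^{*\prime}_{1V\text{-min}}$ and
$\mu^{*\prime}_{1V\text{-max}}$, may be given as follows. Let $\theta^{**}_V$ be an unknown estimator
which makes the integral of the conditional expectation of the squared
error of estimation, given $\theta$, for the interval, $(\underline{\theta}, \bar{\theta})$, minimum.
We define $Q$ such that

(3.18)
$$
Q = \int_{\underline{\theta}}^{\bar{\theta}} E[(\theta^{**}_V - \theta)^2 \mid \theta]\,d\theta
  = \sum_V \int_{\underline{\theta}}^{\bar{\theta}} (\theta^{**}_V - \theta)^2\,P_V(\theta)\,d\theta \; .
$$

Differentiating (3.18) with respect to $\theta^{**}_V$ and setting the result equal
to zero, we obtain

(3.19)
$$
\theta^{**}_V = \int_{\underline{\theta}}^{\bar{\theta}} \theta\,P_V(\theta)\,d\theta\;\Bigl[\int_{\underline{\theta}}^{\bar{\theta}} P_V(\theta)\,d\theta\Bigr]^{-1} .
$$

Thus we have found that the estimator $\theta^{**}_V$ is the one which makes
the integral of the conditional expectation of the squared error of
estimation, given $\theta$, for the interval, $(\underline{\theta}, \bar{\theta})$, minimum. Note
that (3.19) includes no other response patterns, and is given as a
function of $V$ and the interval, $(\underline{\theta}, \bar{\theta})$, only. This implies that
$\theta^{**}_V$ can be used as the estimate for a specific response pattern
when those for any other response patterns are already given. For
example, if we define $\theta^{***}_V$ in such a way that

(3.20)
$$
\theta^{***}_V
\begin{cases}
= \theta^{***}_{V\text{-min}} & \text{for } V = V\text{-min} \\
= \theta^{***}_{V\text{-max}} & \text{for } V = V\text{-max} \\
= \hat{\theta}_V & \text{otherwise ,}
\end{cases}
$$

and search for $\theta^{***}_{V\text{-min}}$ and $\theta^{***}_{V\text{-max}}$ following the above principle,
then we will obtain

(3.21)
$$
\theta^{***}_{V\text{-min}} = \int_{\underline{\theta}}^{\bar{\theta}} \theta\,P_{V\text{-min}}(\theta)\,d\theta\;\Bigl[\int_{\underline{\theta}}^{\bar{\theta}} P_{V\text{-min}}(\theta)\,d\theta\Bigr]^{-1} = \mu^{*\prime}_{1V\text{-min}}
$$

and

(3.22)
$$
\theta^{***}_{V\text{-max}} = \int_{\underline{\theta}}^{\bar{\theta}} \theta\,P_{V\text{-max}}(\theta)\,d\theta\;\Bigl[\int_{\underline{\theta}}^{\bar{\theta}} P_{V\text{-max}}(\theta)\,d\theta\Bigr]^{-1} = \mu^{*\prime}_{1V\text{-max}} \; .
$$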

This is exactly the case with the present situation, which provides
us with the justification for using $\mu^{*\prime}_{1V\text{-min}}$ and $\mu^{*\prime}_{1V\text{-max}}$ for the
two extreme response patterns, V-min and V-max, which substitute
for the negative and positive infinities, respectively, of the
maximum likelihood estimate. An advantage of these two estimates,
$\mu^{*\prime}_{1V\text{-min}}$ and $\mu^{*\prime}_{1V\text{-max}}$, over $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$, is that they are
theoretical values, obtainable without depending upon the Monte
Carlo method. Note, however, that no consideration for the conditional
unbiasedness of the estimator is given in adopting $\mu^{*\prime}_{1V\text{-min}}$ and
$\mu^{*\prime}_{1V\text{-max}}$. We computed these values for Subtest 3 using the interval
of $\theta$, (-2.5, 2.5), and they turned out to be -2.2684 and
2.2884, respectively. Comparison of these values with $\hat{\theta}^*_{V\text{-min}}$
(= -2.3145) and $\hat{\theta}^*_{V\text{-max}}$ (= 2.0403) reveals that they are not too
far from each other.
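The ratio of integrals in (3.21) and (3.22) is straightforward to evaluate numerically. The following sketch does so by the midpoint rule for a hypothetical two-item test, assuming the normal ogive form for three-category items in which the lowest category has probability $1 - \Phi[a_g(\theta - b_1)]$ and the highest has probability $\Phi[a_g(\theta - b_2)]$. The item parameters in `ITEMS` are invented for illustration and are chosen symmetric about zero; they are not the parameters of Subtest 3.

```python
import math


def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Hypothetical items as (a_g, b_1, b_2); symmetric about zero by construction
ITEMS = [(1.5, -1.0, 0.0), (1.5, 0.0, 1.0)]


def p_vmin(theta):
    """Probability of the lowest score category on every item."""
    prob = 1.0
    for a, b1, _ in ITEMS:
        prob *= 1.0 - phi(a * (theta - b1))
    return prob


def p_vmax(theta):
    """Probability of the highest score category on every item."""
    prob = 1.0
    for a, _, b2 in ITEMS:
        prob *= phi(a * (theta - b2))
    return prob


def weighted_mean(p, lo, hi, steps=2000):
    """Midpoint-rule value of the ratio of integrals in (3.21) and (3.22)."""
    h = (hi - lo) / steps
    num = den = 0.0
    for i in range(steps):
        t = lo + (i + 0.5) * h
        w = p(t)
        num += t * w
        den += w
    return num / den


sub_min = weighted_mean(p_vmin, -2.5, 2.5)
sub_max = weighted_mean(p_vmax, -2.5, 2.5)
print(round(sub_min, 4), round(sub_max, 4))
```

For this symmetric toy test the two substitutes are mirror images of each other, and both necessarily lie inside the interval, since each is a weighted average of ability values taken over the interval.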


It should be noted that the above values of $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$
are not the least and greatest values of $\hat{\theta}^*_V$. In fact, among the
four hundred and seventy-four finite values of the maximum likelihood
estimate, we find such values as -2.4698 (8), -2.3887 (5),
-2.3846 (2) and -2.3585 (6), which are less than -2.3145, and
2.4651 (7), 2.3526 (12), 2.3454 (7), 2.3359 (2), 2.2762 (3),
2.0885 (5) and 2.0789 (3), which are greater than 2.0403,
with the integer attached to each value in parentheses indicating the
corresponding frequency. Considering the fact that $\hat{\theta}^*_{V\text{-min}}$ and
$\hat{\theta}^*_{V\text{-max}}$ substitute for negative and positive infinities of the
maximum likelihood estimate, it may be hard to accept these values.
This fact indicates that the effectiveness of Subtest 3 as an
instrument for measuring ability $\theta$ extends itself for a greater
interval of $\theta$ than (-2.5, 2.5). From the definition of the
substitute estimates, $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$, and also from the findings
in the preceding chapter with respect to $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$,
we can expect to be able to find an optimal interval of $\theta$ for
which the modified maximum likelihood estimate, $\hat{\theta}^*_V$, is, approximately,
conditionally unbiased, with $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$ being the least
and the greatest values of the estimate $\hat{\theta}^*_V$.


IV  Modified Maximum Likelihood Estimate for the Transformed Latent Trait

We have seen in the preceding chapter that, for Subtest 3,
the two alternative estimates, $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$, for the two
extreme response patterns may be too small in absolute value to accept
as the substitutes for the negative and positive infinities of the
maximum likelihood estimate, if we take the interval of $\theta$, (-2.5, 2.5),
for $(\underline{\theta}, \bar{\theta})$. The logical step we should take next
will, therefore, be to search for an optimal interval for $(\underline{\theta}, \bar{\theta})$
for this purpose. This can be done by expanding the interval as
far as possible in both negative and positive directions, with
the restriction that the resultant $\hat{\theta}^*_{V\text{-min}}$ and $\hat{\theta}^*_{V\text{-max}}$ provide
us with an approximately conditionally unbiased estimator $\hat{\theta}^*_V$ for
that interval of $\theta$.

We notice, however, that we need a transformation of $\theta$ to
$\tau$ in order to use a test like Subtest 3, which does not have a
constant test information function, in the role of the Old Test in our
methods of estimating the operating characteristics of discrete item
responses (Samejima, RR-80-2, RR-80-4). The search for the alternative
estimates for the two extreme response patterns, V-min and V-max,
will, therefore, become more meaningful if we do it with respect to
$\tau$, which provides us with a constant test information function,
$I^*(\tau)$, for Subtest 3.

Let $P^*_V(\tau)$ be the operating characteristic of the response
pattern $V$, which is defined as a function of the transformed
latent trait $\tau$. This conditional probability, given ability $\theta$,
stays the same as the original operating characteristic, $P_V(\theta)$, as long
as $\tau$ is a strictly increasing, or decreasing, function of $\theta$. Thus
we can write

(4.1)  $P^*_V(\tau) = P_V(\theta)$ .


Let $\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$ denote the estimates of $\tau$ which are
analogous to $\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$ defined by (2.14).
We can write

(4.2)
$$
\begin{cases}
\tau^*_{V\text{-min}} = \Bigl[\tfrac{1}{2}(\tau_c^2 - \underline{\tau}^2) - \sum_{V \ne V\text{-min},\; V \ne V\text{-max}} \hat{\tau}_V \int_{\underline{\tau}}^{\tau_c} P^*_V(\tau)\,d\tau\Bigr]\Bigl[\int_{\underline{\tau}}^{\tau_c} P^*_{V\text{-min}}(\tau)\,d\tau\Bigr]^{-1} \\[6pt]
\tau^*_{V\text{-max}} = \Bigl[\tfrac{1}{2}(\bar{\tau}^2 - \tau_c^2) - \sum_{V \ne V\text{-min},\; V \ne V\text{-max}} \hat{\tau}_V \int_{\tau_c}^{\bar{\tau}} P^*_V(\tau)\,d\tau\Bigr]\Bigl[\int_{\tau_c}^{\bar{\tau}} P^*_{V\text{-max}}(\tau)\,d\tau\Bigr]^{-1} ,
\end{cases}
$$

where $\underline{\tau}$ and $\bar{\tau}$ indicate the lower and upper endpoints of an
appropriately selected interval of $\tau$, $\tau_c$ is a critical value of
$\tau$ below which $P^*_{V\text{-max}}(\tau)$ assumes negligibly small values, and
above which so does $P^*_{V\text{-min}}(\tau)$, and $\hat{\tau}_V$ is the maximum likelihood
estimate of $\tau$ which is assigned to the response pattern $V$. Let
us assume that the first three values, $\underline{\tau}$, $\bar{\tau}$ and $\tau_c$, are
directly transformed from $\underline{\theta}$, $\bar{\theta}$ and $\theta_c$, respectively, through
the strictly increasing transformation

(4.3)  $\tau = \tau(\theta) = \sum_{k=0}^{8} \alpha^*_k\,\theta^k$   for  $-4.0 < \theta < 4.0$ ,

whose nine coefficients are given in Table 4-2. The critical value,
$\tau_c$, which was transformed from $\theta_c$ (= -0.4146) through (4.3),
turned out to be -0.5455. Again, this value of $\tau$ does not make
the product of the two operating characteristics, $P^*_{V\text{-min}}(\tau)$ and
$P^*_{V\text{-max}}(\tau)$, maximal, but is deviated toward the negative side. It
should be noted that the maximum likelihood estimate, $\hat{\tau}_V$, of the
transformed ability $\tau$ is also given as the direct transformation of
$\hat{\theta}_V$ (Samejima, 1969) through (4.3), for every response pattern $V$.
Thus we can rewrite (4.2) in the form


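(4.4)
$$
\begin{cases}
\tau^*_{V\text{-min}} = \Bigl[\tfrac{1}{2}\bigl\{[\tau(\theta_c)]^2 - [\tau(\underline{\theta})]^2\bigr\} - \sum_{V \ne V\text{-min},\; V \ne V\text{-max}} \tau(\hat{\theta}_V) \int_{\underline{\theta}}^{\theta_c} P_V(\theta)\,\frac{d\tau}{d\theta}\,d\theta\Bigr]\Bigl[\int_{\underline{\theta}}^{\theta_c} P_{V\text{-min}}(\theta)\,\frac{d\tau}{d\theta}\,d\theta\Bigr]^{-1} \\[6pt]
\tau^*_{V\text{-max}} = \Bigl[\tfrac{1}{2}\bigl\{[\tau(\bar{\theta})]^2 - [\tau(\theta_c)]^2\bigr\} - \sum_{V \ne V\text{-min},\; V \ne V\text{-max}} \tau(\hat{\theta}_V) \int_{\theta_c}^{\bar{\theta}} P_V(\theta)\,\frac{d\tau}{d\theta}\,d\theta\Bigr]\Bigl[\int_{\theta_c}^{\bar{\theta}} P_{V\text{-max}}(\theta)\,\frac{d\tau}{d\theta}\,d\theta\Bigr]^{-1} .
\end{cases}
$$

It is obvious from (4.4) that, in general, we have

(4.5)
$$
\begin{cases}
\tau^*_{V\text{-min}} \ne \tau(\theta^*_{V\text{-min}}) \\
\tau^*_{V\text{-max}} \ne \tau(\theta^*_{V\text{-max}}) \; .
\end{cases}
$$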


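The sample statistic versions of $\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$, which will be
denoted by $\hat{\tau}^*_{V\text{-min}}$ and $\hat{\tau}^*_{V\text{-max}}$, respectively, are defined by

(4.6)
$$
\begin{cases}
\hat{\tau}^*_{V\text{-min}} = \Bigl[\tfrac{1}{2}(\tau_c + \underline{\tau})\,N_L - \sum_{V \ne V\text{-min},\; V \ne V\text{-max}} \hat{\tau}_V\,N_{LV}\Bigr]\,N_{LV\text{-min}}^{-1} \\[6pt]
\hat{\tau}^*_{V\text{-max}} = \Bigl[\tfrac{1}{2}(\bar{\tau} + \tau_c)\,N_H - \sum_{V \ne V\text{-min},\; V \ne V\text{-max}} \hat{\tau}_V\,N_{HV}\Bigr]\,N_{HV\text{-max}}^{-1} \; ,
\end{cases}
$$

where $N_L$, $N_H$, $N_{LV}$, $N_{HV}$, $N_{LV\text{-min}}$ and $N_{HV\text{-max}}$ are as defined
in the preceding chapter. From (4.6) and (3.3), it is obvious that,
again, in general, we have

(4.7)
$$
\begin{cases}
\hat{\tau}^*_{V\text{-min}} \ne \tau(\hat{\theta}^*_{V\text{-min}}) \\
\hat{\tau}^*_{V\text{-max}} \ne \tau(\hat{\theta}^*_{V\text{-max}}) \; .
\end{cases}
$$

We must make our choice, therefore, as to which of the two sets of
the alternative estimates should be taken. In this chapter, our
choice is to take $\hat{\tau}^*_{V\text{-min}}$ and $\hat{\tau}^*_{V\text{-max}}$.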

The transformation of $\theta$ to $\tau$ for Subtest 3 starts from
the approximation of the square root of the test information function
by a polynomial, using the method of moments (Elderton and Johnson,
1969; Johnson and Kotz, 1970), as was done for some other subtests
of the original Old Test in a previous study (Samejima, RR-80-4).
Note that such a polynomial is the best fitted polynomial of a given
degree in the least squares principle (Samejima and Livingston,
RR-79-2).


The degree of the polynomial selected here is seven, and the
interval of $\theta$ for which the method of moments was applied is
(-4.0, 4.0). The eight coefficients of the resultant polynomial,
$\sum_{k=0}^{7} \alpha_k\,\theta^k$, are presented in Table 4-1 in the ascending order of $k$,
and the polynomial is shown by a dotted line in Figure 3-1 of the
preceding chapter, together with the original square root of the
test information function of Subtest 3. We can see in this
figure that the polynomial thus obtained provides us with an extremely
good approximation to the square root of the test information function
of Subtest 3. Using this approximated polynomial, the transformation
of ability $\theta$ to $\tau$ is also given in the form of another polynomial
(Samejima, RR-80-2), such that

(4.8)  $\tau(\theta) = \sum_{k=0}^{8} \alpha^*_k\,\theta^k$ ,

where

(4.9)
$$
\alpha^*_k
\begin{cases}
= d & k = 0 \\
= (Ck)^{-1}\,\alpha_{k-1} & k = 1, 2, \ldots, 8 \; ,
\end{cases}
$$

with $d$ indicating an arbitrarily set constant, and $C$ being
the constant which the square root of the test information function,
$[I^*(\tau)]^{1/2}$, assumes for the interval of $\tau$ of our interest. In
the present study, we use $d = 0$ and $C = 3.5$. The coefficients,
$\alpha^*_k$, of the resultant polynomial, which transforms $\theta$ to $\tau$, are
shown in the ascending order of $k$ in Table 4-2, and the functional
relationship between $\theta$ and $\tau$ is observed in Figure 4-1. The
two operating characteristics, $P^*_{V\text{-min}}(\tau)$ and $P^*_{V\text{-max}}(\tau)$, are shown
by solid and dotted lines, respectively, together with the position
of the critical value, $\tau_c$ (= -0.5455), in Figure 4-2.


TABLE 4-1

Coefficients of the Polynomial of Degree 7
Obtained by the Method of Moments Using
the Interval of $\theta$, (-4.0, 4.0), to
Approximate the Square Root of the Test
Information Function of Subtest 3.

  k      α_k
  0      0.46408884D+01
  1      0.60789659D-01
  2     -0.41482735D+00
  3      0.14684659D-01
  4      0.51686862D-02
  5     -0.36903316D-02
  6      0.21313602D-03
  7      0.15726020D-03
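The step from the coefficients of Table 4-1 to the transformation (4.8) is mechanical, and can be sketched as follows. The code applies (4.9) with $d = 0$ and $C = 3.5$ to the eight tabled coefficients, which amounts to integrating the approximating polynomial term by term and dividing by $C$, and then evaluates the resulting polynomial at the critical value -0.4146. The names `ALPHA`, `ALPHA_STAR` and `tau` are illustrative, and the nine derived coefficients are computed here rather than copied from Table 4-2, which is not reproduced in this section.

```python
# Coefficients of the degree-7 polynomial from Table 4-1 (ascending order of k)
ALPHA = [0.46408884e+01, 0.60789659e-01, -0.41482735e+00, 0.14684659e-01,
         0.51686862e-02, -0.36903316e-02, 0.21313602e-03, 0.15726020e-03]

C = 3.5   # constant value of the square root of the transformed information function
D = 0.0   # arbitrarily set constant d

# (4.9): alpha*_0 = d, alpha*_k = alpha_{k-1} / (C k) for k = 1, ..., 8
ALPHA_STAR = [D] + [ALPHA[k - 1] / (C * k) for k in range(1, 9)]


def tau(theta):
    """Transformation (4.8), evaluated by Horner's rule."""
    result = 0.0
    for coef in reversed(ALPHA_STAR):
        result = result * theta + coef
    return result


tau_c = tau(-0.4146)
print(round(tau_c, 4))
```

Evaluated this way, the transformed critical value agrees with the reported -0.5455 to about three decimal places; the small remaining difference is of the size that would be expected from the critical value of $\theta$ being quoted to only four decimals.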

As for the interval, $(\underline{\tau}, \bar{\tau})$, seven different cases were
chosen more or less arbitrarily, and are shown as Cases 1 through 7
in Table 4-3. The intervals were selected in such a way that we first
set the values of $\min\,[I(\theta)]^{1/2}$, the lower bound of the square root
of the test information function, then determined the corresponding
intervals, $(\underline{\theta}, \bar{\theta})$, and, finally, obtained the pairs of values, $\underline{\tau}$ and
$\bar{\tau}$, through (4.8). In addition to these seven cases,
another interval of $\tau$, (-3.0, 3.0), was added as Case 8.

The number of hypothetical examinees for each of the eight
cases was determined in the following way. It was intended that
these numbers should be substantially larger than five hundred,
which was used in the preceding chapter, in order to decrease the
error caused by the Monte Carlo method. For Case 8, we use five
thousand hypothetical examinees, or $N = 5,000$. They were placed
at one thousand positions of $\tau$, which start from -2.997
and end with 2.997, with equal steps of 0.006, with five
hypothetical examinees sharing each position. For Case 7, out of
these five thousand hypothetical examinees, those who were located
outside of the interval, (-2.8267, 2.8095), were excluded. Thus
the total number of the hypothetical examinees is 4,695 in Case 7,


FIGURE 4-2

Operating Characteristics of V-min (Solid Line) and V-max
(Dotted Line) Given As Functions of the Transformed Latent
Trait $\tau$, Together with the Position of the Critical
Value, $\tau_c$.
with the exclusion of the first 145 examinees and the last 160
examinees from the original five thousand. In each of the remaining
six cases, the number of hypothetical examinees was determined in a
similar manner, and it is presented in the last column of Table 4-3.
From this total number of examinees, $N$, the two sample sizes, $N_L$
and $N_H$, were determined in each case, depending upon how many
examinees were positioned below and above the critical value, $\tau_c$
(= -0.5455). These numbers are also presented in Table 4-3.
Thus, in each case, these hypothetical examinees can be considered as
a sample from the uniform distribution for the interval,
$(\underline{\tau}, \overline{\tau})$. Note, however, that, because of
the way the examinees were selected, the values $\underline{\tau}$ and
$\overline{\tau}$ were slightly shifted for Cases 1 through 7.
The new endpoints of the interval of $\tau$ are presented in Table 4-5
for the four cases, Cases 4 through 7.
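The placement of the hypothetical examinees described above can be
sketched as follows. This is a minimal illustration in Python, not the
program used in the study; the function name and the returned fields
are hypothetical. For the interval of Case 7 it reproduces the total of
4,695 examinees and the exclusion of the first 145 and the last 160.

```python
STEP = 0.006          # spacing between adjacent positions of tau
START = -2.997        # lowest position of tau
N_POSITIONS = 1000    # number of distinct positions
PER_POSITION = 5      # hypothetical examinees sharing each position
TAU_C = -0.5455       # critical value separating the two groups

POSITIONS = [START + STEP * i for i in range(N_POSITIONS)]


def select_examinees(lower, upper):
    """Count the examinees kept in (lower, upper) and those excluded."""
    kept = [t for t in POSITIONS if lower < t < upper]
    n_total = PER_POSITION * len(kept)
    n_low = PER_POSITION * sum(1 for t in kept if t < TAU_C)
    return {
        "N": n_total,
        "N_L": n_low,                  # examinees below the critical value
        "N_H": n_total - n_low,        # examinees above the critical value
        "excluded_below": PER_POSITION * sum(1 for t in POSITIONS if t <= lower),
        "excluded_above": PER_POSITION * sum(1 for t in POSITIONS if t >= upper),
    }


case_8 = select_examinees(-3.0, 3.0)
case_7 = select_examinees(-2.8267, 2.8095)
print(case_8)
print(case_7)
```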

It turned out that for the first three cases, Cases 1 through
3, the two frequencies, $N_{LV\text{-min}}$ and $N_{HV\text{-max}}$,
are very small, i.e., 1 and 3 for Case 1, 1 and 10 for Case 2, and
8 and 19 for Case 3, respectively. This is due to the fact that these
three intervals of $\tau$ are relatively small, and the probability
with which an examinee whose ability level is within each interval
obtains either V-min or V-max is low. With these small frequencies
substituted in (4.6), we obtained such absurd results for
$\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$ as 7.7998 and
-2.2507 for Case 1, 11.3745 and 0.1132 for Case 2, and -0.8183 and
1.4841 for Case 3, respectively. It is obvious that we should not take
these results seriously, and we must conclude that these three
intervals of $\tau$ are too small for our purpose of obtaining the two
estimates, $\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$. In all
eight cases, both $N_{LV\text{-max}}$ and $N_{HV\text{-min}}$ turned
out to be zero, a fact which indicates the success in selecting
the critical value, $\tau_c$.

Table 4-4 presents the resultant values of $\tau^*_{V\text{-min}}$ and
$\tau^*_{V\text{-max}}$, together with the two frequencies,
$N_{V\text{-min}}$ and $N_{V\text{-max}}$, for each of Cases 4
through 8. These two estimates increase in absolute value
as the interval becomes larger, as is expected from their definitions.

The sample regressions of $\tau^*$ on $\tau$ for Cases 4 through 8
are presented in Figures 4-3 through 4-7, respectively. In each graph,
the mean of the five $\tau^*$'s corresponding to a fixed value of
$\tau$ is plotted as one point, which makes the total number of points
836 for Case 4, 861 for Case 5, 907 for Case 6, 939 for Case 7, and
1,000 for Case 8, respectively. We can see that, in each case,
these points of sample regression cluster around the unbiasedness
line, or the straight line at forty-five degrees from the abscissa
passing through the origin, (0,0), which is shown in each of the five
figures.

Table 4-5 presents the sample mean and variance of $\tau$, those
of $\tau^*$, and the sample product-moment correlation coefficient of
$\tau$ and $\tau^*$, together with the two endpoints of the interval,
$(\underline{\tau}, \overline{\tau})$, for each of Cases 4 through 8.
In the same table, also presented in parentheses are the corresponding
expectations, population variances and population correlation
coefficients. These values
TABLE 4-4

Two Estimates, $\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$, and
the Numbers of Hypothetical Subjects, $N_{V\text{-min}}$ and
$N_{V\text{-max}}$, Who Obtained Either of the Two Extreme Response
Patterns, V-min and V-max, Respectively, in Each of the Five
Cases, Cases 4 through 8.

    Case   $\tau^*_{V\text{-min}}$   $N_{V\text{-min}}$   $\tau^*_{V\text{-max}}$   $N_{V\text{-max}}$

     4         -1.6061                  23                   2.0856                  32
     5         -2.0651                  39                   2.2750                  42
     6         -2.4788                  81                   2.5455                  74
     7         -2.6867                 145                   2.6865                  93
     8         -2.8214                 258                   2.8596
FIGURE 4-4

Sample Regression of $\tau^*$ on $\tau$, for 861 Fixed Values of
$\tau$. Case 5.
were obtained by substituting $\tau$ for $\theta$ in (3.6), (3.7),
(3.10), (3.11), (3.12) and (3.13), and $\tau^*$ for $\theta^*$ in the
last four of these six formulas, and by using $C^{-2}$ (= 0.081633) for
$E\{\mathrm{Var.}(\tau^* \mid \tau)\}$. We can see that, in each case,
these sample means, variances and correlation coefficients are very
close to the corresponding population parameters. It is interesting to
note, however, that there is a mild tendency for the sample variance of
$\tau^*$ to be less than the population variance, and for the sample
correlation coefficient between $\tau$ and $\tau^*$ to be greater than
the population correlation coefficient.
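The population values can be illustrated by the following sketch in
Python. Formulas (3.6) through (3.13) are not reproduced in this
chapter, so the sketch rests on assumptions which should be kept in
mind: $\tau$ is uniformly distributed on the interval, $\tau^*$ is
conditionally unbiased, and the conditional variance of $\tau^*$ is the
constant $C^{-2}$. Under these assumptions the variance of $\tau^*$ is
the variance of $\tau$ plus $C^{-2}$, and the squared correlation is
the ratio of the two variances. The function name is hypothetical.

```python
import math

C = 3.5  # constant square root of the test information function


def population_moments(lower, upper, c=C):
    """Moments of tau and tau* under the assumptions stated above."""
    mean = (lower + upper) / 2.0               # E(tau) = E(tau*) if unbiased
    var_tau = (upper - lower) ** 2 / 12.0      # variance of a uniform variable
    error_variance = c ** -2                   # assumed E{Var(tau* | tau)}
    var_tau_star = var_tau + error_variance    # law of total variance
    correlation = math.sqrt(var_tau / var_tau_star)
    return mean, var_tau, var_tau_star, correlation


print(round(C ** -2, 6))
print(population_moments(-3.0, 3.0))
```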

Table 4-6 presents the two coefficients of the linear regression,
$\alpha\tau + \beta$, of $\tau^*$ on $\tau$, or the best fitted line in
the least squares principle, for each of Cases 4 through 8. We can see
that the first coefficient, $\alpha$, is very close to unity, and the
second coefficient, $\beta$, is very close to zero, in each of the five
cases, and the linear regression is practically the same as the
unbiasedness line, or the line at forty-five degrees from the abscissa
passing through the origin, (0,0). Evidently, the two alternative
estimates, $\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$, turned
out to be suitable substitutes for the negative and positive infinities
of the maximum likelihood estimate, so that the resultant $\tau^*$ is,
approximately, conditionally unbiased for the interval,
$(\underline{\tau}, \overline{\tau})$, in each of the five cases. If we
extend the interval beyond $(\underline{\tau}, \overline{\tau})$,
however, the approximate unbiasedness of $\tau^*$ does not necessarily
hold. The expansion of the interval to (-3.0, 3.0) for Case 7, for
example, makes the linear regression $0.98339\tau + 0.00043$, that for
Case 6 makes it $0.96849\tau + 0.00563$, and that for Case 5 makes it
$0.93922\tau + 0.01638$, all of which are flatter than the original
linear regressions.
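The least squares line of $\tau^*$ on $\tau$ can be obtained as in the
following sketch in Python. The data here are synthetic: $\tau^*$ is
generated as $\tau$ plus a normal error with standard deviation
$1/C$, which is an assumption made only for illustration and is not the
modified maximum likelihood estimate computed in the study. The
function name is hypothetical.

```python
import random


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic illustration: five examinees at each of 1,000 positions.
C = 3.5
rng = random.Random(0)
taus = [-2.997 + 0.006 * i for i in range(1000) for _ in range(5)]
tau_stars = [t + rng.gauss(0.0, 1.0 / C) for t in taus]

alpha, beta = least_squares_line(taus, tau_stars)
print(alpha, beta)
```

With an unbiased estimate, as assumed here, the slope is expected to be
close to unity and the intercept close to zero.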

It is observed that, in all five cases, the least value of
the finite maximum likelihood estimates which our hypothetical
examinees obtained is -2.6518, and the greatest value 2.7683.
These two values are larger in absolute value than
$\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$, respectively, for
the four cases, Cases 4 through 7, and only Case 8 provides us with
$\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$ which are
larger in absolute value than these two finite maximum likelihood
estimates. This fact implies that, out of the five sets of intervals
of $\tau$ and the corresponding pairs of alternative estimates, those
of Case 8 may be the most suitable ones for Subtest 3. This set of
alternative estimates also gives us an approximate conditional
unbiasedness of $\tau^*$ for truncated intervals. Figure 4-8 presents
the sample regression of $\tau^*$ on $\tau$ in Case 8, for the
truncated interval of $\tau$, (-2.430, 2.586), which is the same as the
interval, $(\underline{\tau}, \overline{\tau})$, in Case 4. We can see
in this figure that the sample regression still clusters around the
unbiasedness line for this truncated interval. In contrast to this,
Figure 4-9 presents the sample regression in Case 4 for the extended
interval of $\tau$, (-3.0, 3.0). The awkward shapes of the clusters
around the two endpoints of the extended interval of $\tau$ indicate
that the two alternative estimates in Case 4 fail to provide us with an
approximate conditional unbiasedness of $\tau^*$ for this extended
interval of $\tau$.


FIGURE 4-8

Sample Regression of $\tau^*$ on $\tau$: Case 8, Using the Interval
(-2.430, 2.586), Instead of (-3.000, 3.000).


FIGURE 4-9

Sample Regression of $\tau^*$ on $\tau$: Case 4, Using the Interval
(-3.000, 3.000), Instead of (-2.430, 2.586).
The error score for each individual examinee $s$ is defined
as was done in the preceding chapter, by replacing $\theta_s$ by
$\tau_s$, $\theta^*_s$ by $\tau^*_s$, and $I(\theta)$ by $I^*(\tau)$ in
(3.17). For convenience, the same symbol, $e_s$, will be used for the
error score defined for $\tau^*_s$. Note that, in the present
situation, the test information function, $I^*(\tau)$, is constant
(= $3.5^2$) for the interval of $\tau$ of our interest, instead of
being a unimodal function. The error score $e_s$ was computed for each
of the 4,180 examinees of Case 4, and each of the 5,000 examinees of
Case 8. The frequency distributions of these error scores are presented
as histograms in Figures 4-10 and 4-11, respectively, with the category
width of 0.2, together with the standard normal density function. We
can see that these two histograms are much closer to the standard
normal density function, in comparison with those obtained in the
preceding chapter upon the five hundred hypothetical examinees. It is
also noted that these two resultant histograms are substantially
different from each other, in spite of the fact that 4,125 error
scores are common to these two histograms. The chi-square test for the
goodness of fit of these two frequency distributions against the
standard normal density function gives $\chi^2_0 = 23.3491$ with 29
degrees of freedom (.70 < p < .80) for Case 4, and
$\chi^2_0 = 55.6856$ with the same number of degrees of freedom
(.001 < p < .01) for Case 8. The mean, variance and standard deviation
of the error score, $e_s$, are 0.0028, 1.0070 and 1.0035 for Case 4,
and 0.0009, 0.9206 and 0.9595 for Case 8, respectively. It is
interesting to note that


FIGURE 4-10

Frequency Distribution of the Error Score, $e_s$, Which Is Based upon
Subtest 3, for the 4,180 Hypothetical Examinees of Case 4, Compared
with the Standard Normal Density Function.


FIGURE 4-11

Frequency Distribution of the Error Score, $e_s$, Which Is Based upon
Subtest 3, for the 5,000 Hypothetical Examinees of Case 8, Compared
with the Standard Normal Density Function.
the dispersion of the error score in Case 8 is substantially less than
unity. The correlation coefficient between $\tau_s$ and $e_s$ is -0.021
for Case 4, and -0.025 for Case 8. The sample linear regression
of the error score $e_s$ on $\tau$ is $-0.01444\tau + 0.00387$ in
Case 4, and $-0.01403\tau + 0.00093$ in Case 8, both of which are very
close to zero.
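The computation of the error score and of the chi-square statistic can
be sketched as follows in Python. Formula (3.17) is not reproduced in
this chapter; the sketch assumes that the error score is the estimation
error multiplied by the square root of the test information function,
that is, $e_s = C(\tau^*_s - \tau_s)$. The grouping into thirty
categories (twenty-eight of width 0.2 plus two open tails, giving 29
degrees of freedom) is also an assumption, as are all function names,
and the data are synthetic.

```python
import math
import random

C = 3.5  # square root of the constant test information function


def error_score(tau_star, tau, c=C):
    """Assumed form of the error score: c * (tau* - tau)."""
    return c * (tau_star - tau)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chi_square_against_normal(scores, width=0.2, n_categories=30):
    """Chi-square statistic and degrees of freedom against N(0, 1)."""
    n_inner = n_categories - 2
    lowest_cut = -width * n_inner / 2.0
    cuts = [lowest_cut + width * j for j in range(n_inner + 1)]
    observed = [0] * n_categories
    for e in scores:
        if e < cuts[0]:
            observed[0] += 1                      # lower open tail
        elif e >= cuts[-1]:
            observed[-1] += 1                     # upper open tail
        else:
            j = min(int((e - lowest_cut) / width), n_inner - 1)
            observed[1 + j] += 1
    probabilities = (
        [normal_cdf(cuts[0])]
        + [normal_cdf(b) - normal_cdf(a) for a, b in zip(cuts, cuts[1:])]
        + [1.0 - normal_cdf(cuts[-1])]
    )
    n = len(scores)
    statistic = sum(
        (o - n * p) ** 2 / (n * p) for o, p in zip(observed, probabilities)
    )
    return statistic, n_categories - 1


# Synthetic data: tau* generated as tau plus a normal error of s.d. 1/C.
rng = random.Random(0)
taus = [-2.997 + 0.006 * i for i in range(1000) for _ in range(5)]
scores = [error_score(t + rng.gauss(0.0, 1.0 / C), t) for t in taus]
statistic, degrees_of_freedom = chi_square_against_normal(scores)
print(statistic, degrees_of_freedom)
```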

The pair of estimates, $\mu^{*\prime}_{1V\text{-min}}$ and
$\mu^{*\prime}_{1V\text{-max}}$, which were introduced in the preceding
chapter, were also obtained with respect to $\tau$, for each of Cases 1
through 8. These results are presented in Table 4-7. We can see that
for larger intervals of $\tau$, like those in Cases 6 through 8, the
resultant estimates are similar to the corresponding values of
$\tau^*_{V\text{-min}}$ and $\tau^*_{V\text{-max}}$, respectively.
V  Discussion and Conclusions

The modified maximum likelihood estimate, $\theta^*$, and its
variation have been introduced and investigated, in comparison
with Bayesian estimates and another population-free estimate.
The former of these two newly proposed estimates is effective when a
given test is short, like LTS-D, and the latter is useful when it
is longer and more informative, like Subtest 3.

The basic idea behind this research is to admit that each test
has a certain limited range of ability for which it is effective in
estimating the examinee's ability. Although this is a self-evident
fact, for some reason, the idea has not been fully accepted by many
researchers, and people tend to use tests for overly wide ranges of
ability, and turn to inappropriate methods like Bayesian estimation
in order to make the result plausible. This is evidently a false
solution, using the pretense that the test has measured something
while it has failed in so doing, and it is mainly an arbitrarily set
prior which has given the examinee his ability score. The greatest
fault of Bayesian estimation may be that it is against the
principle of objective testing, since it contaminates the resulting
ability estimate by something other than the examinee's performance on
the test.

The conditional unbiasedness of the ability estimate, given
ability, is by far the most important property in order to sustain the
principle of the objectivity of testing. Keeping this fact in mind, we
can still try to enhance the usefulness of a given test, by expanding
the range of ability for which the test is effective. One way of doing
this is to provide a suitable estimator.

The maximum likelihood estimator has a useful characteristic
of asymptotic conditional unbiasedness. For less informative tests,
however, the conditional probability with which the examinee obtains
one of the two extreme response patterns, V-min and V-max, given
ability, is substantially high, and this asymptotic characteristic
cannot be used as an approximation. Thus, in such a case, we will
see extreme values like negative and positive infinities among the
maximum likelihood estimates of our examinees, a fact that restricts
the effectiveness of the test.

The two modified maximum likelihood estimates, which were
proposed and discussed in the present paper, were conceived
with the following considerations in mind.

(1)  We follow the principle of the objectivity of testing, and,
in estimating his ability, we use nothing but the examinee's
performance on the test.

(2)  The resultant estimate provides us with an approximate
conditional unbiasedness of estimation.

(3)  The range of ability for which the test is effective is
enhanced.

This has been done by replacing the maximum likelihood estimates for
the two extreme response patterns, V-min and V-max, by
$\theta^*_{V\text{-min}}$ and $\theta^*_{V\text{-max}}$, or by the
corresponding pair of values of the variation, respectively. The
results proved to be promising.

One distinct feature of the present study may be the use of
the Monte Carlo method in obtaining $\theta^*_{V\text{-min}}$ and
$\theta^*_{V\text{-max}}$. One may argue that, because of this, we
cannot avoid the sampling fluctuations of $\theta^*_{V\text{-min}}$ and
$\theta^*_{V\text{-max}}$. While this is true, the fluctuations can be
minimized by using a large enough sample size. In the present study, we
used as large a sample size as 5,000, and this can be made even larger,
if we wish. In any case, even if we use, say, ten thousand hypothetical
examinees, it is still a reduction, considering that even a relatively
short test like Subtest 3 contains as many as 14,348,907 different
response patterns.
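The size of this reduction can be checked with a few lines of Python.
The number 14,348,907 equals $3^{15}$; the reading of this as fifteen
items with three score categories each is an inference made for this
illustration and is not stated in the present chapter.

```python
N_ITEMS = 15        # assumption: number of items in Subtest 3
N_CATEGORIES = 3    # assumption: item score categories per item

n_patterns = N_CATEGORIES ** N_ITEMS   # number of distinct response patterns
sample_size = 10_000                   # hypothetical examinees, from the text

print(n_patterns)
print(sample_size / n_patterns)        # fraction of patterns one could sample
```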

An alternative way may be the use of the estimates,
$\mu^{*\prime}_{1V\text{-min}}$ and $\mu^{*\prime}_{1V\text{-max}}$.
Since these estimates do not depend upon the Monte Carlo method, or any
samples, they do not have the problem of sampling fluctuations. The
approximate conditional unbiasedness may not be reached just as well,
however, if we use $\mu^{*\prime}_{1V\text{-min}}$ and
$\mu^{*\prime}_{1V\text{-max}}$ instead of $\theta^*_{V\text{-min}}$
and $\theta^*_{V\text{-max}}$.
FIGURE A-1

Sample Regression of the Modified Maximum Likelihood
Estimate, $\theta^*$, for the One Hundred Ability Levels:
Each Point is the Mean of Five Values of $\theta^*$.